golden source for combining boycott data + alternatives

Ceasefire Now

boycott-israeli-consumer-goods-dataset

This repository collates all consumer boycott and alternatives data into a single, golden-source, version-controlled dataset that software products and services can consume.

If you're a company looking for SaaS products to avoid, see TechForPalestine/boycott-israeli-tech-companies-dataset.

Sources:

Data

All data is entered and stored as YAML files in the data/ directory. Output formats, such as CSV and JSON, are in the generated/ directory.

Schemas for the YAML data can be found in the schemas directory, along with descriptions for each field. These schemas are in JSON Schema format, but represented in YAML for simplicity. The scripts/validate_yaml.py script validates all brands and companies using the schemas.
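
As a rough illustration of what schema validation checks, the sketch below hand-rolls two JSON Schema keywords (`required` and `type`) over an already-parsed brand record. The schema, the field names (`name`, `status`, `alternatives`), and the `validate_record` helper are all hypothetical and are not taken from the schemas directory or from scripts/validate_yaml.py. A real script would more likely load the YAML files and pass them to a full JSON Schema validator.

```python
# Hypothetical schema using a tiny subset of JSON Schema keywords.
BRAND_SCHEMA = {
    "type": "object",
    "required": ["name", "status"],
    "properties": {
        "name": {"type": "string"},
        "status": {"type": "string"},
        "alternatives": {"type": "array"},
    },
}

# Map JSON Schema type names to the Python types a YAML parser would produce.
JSON_TYPES = {"string": str, "array": list, "object": dict, "boolean": bool}


def validate_record(record, schema):
    """Return a list of readable errors; an empty list means the record passed."""
    errors = []
    # Every field listed under "required" must be present.
    for field in schema.get("required", []):
        if field not in record:
            errors.append(f"missing required field: {field}")
    # Fields that are present must have the declared type.
    for field, rules in schema.get("properties", {}).items():
        if field in record:
            expected = JSON_TYPES[rules["type"]]
            if not isinstance(record[field], expected):
                errors.append(f"{field}: expected {rules['type']}")
    return errors


# Made-up records standing in for parsed YAML files.
good = {"name": "Example Brand", "status": "avoid", "alternatives": ["Other Brand"]}
bad = {"name": 42}

print(validate_record(good, BRAND_SCHEMA))
print(validate_record(bad, BRAND_SCHEMA))
```

Returning a list of errors rather than raising on the first one lets a validation run report every problem in a file at once.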

TODO

  • map high-level locations to country codes? https://www.iban.com/country-codes
  • update empty parents (manually?)
  • review and sanitize the data
  • validation script: read everything in data/companies, check for dupes, check required and optional fields, etc.
  • script to read data/companies and generate the full data in CSV, JSON, and YAML formats, with timestamps.
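
The last two items could be sketched roughly as follows, using only the Python standard library. The record fields (`name`, `parent`), the helper names, and the `generated_at` key are assumptions made for illustration; the actual scripts may be organized quite differently, and reading the YAML files is left out here.

```python
import csv
import io
import json
from collections import Counter
from datetime import datetime, timezone


def find_duplicates(records, key="name"):
    """Return key values that occur more than once, ignoring case and padding."""
    counts = Counter(r[key].strip().lower() for r in records)
    return sorted(k for k, n in counts.items() if n > 1)


def to_json(records, generated_at):
    """Serialize the records to JSON with a top-level timestamp."""
    return json.dumps({"generated_at": generated_at, "brands": records}, indent=2)


def to_csv(records, fieldnames, generated_at):
    """Serialize the records to CSV, adding the timestamp as an extra column."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames + ["generated_at"])
    writer.writeheader()
    for r in records:
        writer.writerow({**r, "generated_at": generated_at})
    return buf.getvalue()


# Made-up records standing in for parsed files from data/companies.
records = [
    {"name": "Brand A", "parent": "Parent Co"},
    {"name": "brand a ", "parent": ""},
    {"name": "Brand B", "parent": "Parent Co"},
]

stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
print(find_duplicates(records))
print(to_csv(records, ["name", "parent"], stamp))
print(to_json(records, stamp))
```

Normalizing names before counting catches near-duplicates such as differing case or trailing spaces, which exact string comparison would miss.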