2020.2 - Data Release Log
over 4 years ago by Mark Hess
Overview
February saw the removal of a large quantity of erroneous data generated by our web crawling apparatus. Most of the removed data came from one particular source.
Record Counts
The following are the updated record counts for each vertical:
- Business data increased by 4.2 million to 97.7 million
- People data stayed at 11.7 million
- Product data decreased by 29.3 million to 156.3 million
- Property data increased by 15.6 million to 86.4 million
New Sources
The following data types have received new sources:
- 2 Business data sources
- 2 Product data sources
- 1 Property data source
Schema Changes
- The asins field is now a single value
- Legacy fields such as crawlResultFiles and automatedCollection have finally been removed
- Additional paymentTypes have been added, as well as acceptable hours
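For anyone consuming older exports alongside this release, the schema changes above can be sketched as a small normalization step. This is only an illustration under stated assumptions: the function name normalize_record is hypothetical, records are assumed to be plain dictionaries, and older records are assumed to have stored asins as a list, which this log does not confirm. Keeping the first entry of such a list is an arbitrary choice made for the example.

```python
from typing import Any, Dict

# Legacy fields named in this release as removed from the schema.
LEGACY_FIELDS = {"crawlResultFiles", "automatedCollection"}


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` shaped like the 2020.2 schema.

    Hypothetical helper: it drops the removed legacy fields and
    collapses a list-valued `asins` into a single value.
    """
    # Copy every field except the removed legacy ones.
    out = {key: value for key, value in record.items() if key not in LEGACY_FIELDS}

    asins = out.get("asins")
    if isinstance(asins, (list, tuple)):
        # Assumption: older records held a list; keep the first entry,
        # or None when the list is empty.
        out["asins"] = asins[0] if asins else None

    return out
```

Records that already carry a single asins value pass through unchanged, so the helper can be applied to a mix of old and new records.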
Fixes
- Postal codes are being collected more accurately as issues regarding our
- Just under 40 million product records, along with some business records, were removed because the collected data was erroneous.
- Updated 1 Business data source and 3 Product data sources