Web Of Science (WoS)
The Knowledge Lab has licensed the Thomson Reuters Web Of Science XML data for use by lab members and by collaborators within the MetaKnowledge network. We maintain both the raw XML data and a curated, MySQL-compatible relational database to support different modes of analysis.
Here are some quick statistics about the wos2 database:
- The database contains publications from 1960 to 2015.
- The publications table contains 57M records.
- The references table contains 1.08B records.
The data is currently stored on two systems. The raw XML files are on S3:
- Complete dataset on S3 storage: s3://klab-webofscience
- Sample dataset on S3 storage: s3://klab-webofscience-sample
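As an illustration of how the raw XML buckets listed above might be browsed, here is a minimal Python sketch. It assumes that boto3 is installed and that your AWS credentials have already been granted access to the buckets. The helper names `split_s3_uri` and `list_keys` are hypothetical and not part of any Knowledge Lab tooling.

```python
from urllib.parse import urlparse

# Bucket URIs as listed in this document.
FULL_DATASET_URI = "s3://klab-webofscience"
SAMPLE_DATASET_URI = "s3://klab-webofscience-sample"


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def list_keys(uri, max_keys=10):
    """Return up to max_keys object keys under the given S3 URI.

    Assumes boto3 is installed and credentials with access are configured.
    """
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, prefix = split_s3_uri(uri)
    client = boto3.client("s3")
    response = client.list_objects_v2(
        Bucket=bucket, Prefix=prefix, MaxKeys=max_keys
    )
    # "Contents" is absent from the response when no objects match.
    return [obj["Key"] for obj in response.get("Contents", [])]
```

Starting with the sample bucket, for example `list_keys(SAMPLE_DATASET_URI)`, is a cheaper way to inspect the file layout than listing the complete dataset.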
The parsed relational data is hosted on two RDS servers:
- AWS MySQL-compatible database: wos.cluster-cvirc91pe37a.us-east-1.rds.amazonaws.com
- Updated MySQL database: wos2.cvirc91pe37a.us-east-1.rds.amazonaws.com
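A connection to the updated MySQL server could be sketched in Python as follows. This is only an outline under several assumptions: the PyMySQL package is installed, the server listens on the default MySQL port 3306, the database is named `wos2`, and the tables are literally named `publications` and `references` as in the statistics above. The helper names are hypothetical, credentials must come from the Administrators, and the code only works from inside the network.

```python
# Host as listed in this document; the database name "wos2" is an assumption.
WOS2_HOST = "wos2.cvirc91pe37a.us-east-1.rds.amazonaws.com"
WOS2_DATABASE = "wos2"


def connection_params(user, password, port=3306):
    """Build keyword arguments for pymysql.connect (port 3306 is assumed)."""
    return {
        "host": WOS2_HOST,
        "port": port,
        "user": user,
        "password": password,
        "database": WOS2_DATABASE,
    }


def sample_query(table, limit=5):
    """Build a small SELECT statement for one table.

    The table name is backtick-quoted because REFERENCES is a reserved
    word in MySQL.
    """
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"SELECT * FROM `{table}` LIMIT {int(limit)}"


def fetch_sample(user, password, table="publications", limit=5):
    """Fetch a few rows from a table (table names are assumptions)."""
    import pymysql  # imported lazily; requires the PyMySQL package

    conn = pymysql.connect(**connection_params(user, password))
    try:
        with conn.cursor() as cur:
            cur.execute(sample_query(table, limit))
            return cur.fetchall()
    finally:
        conn.close()
```

The `LIMIT` clause keeps exploratory queries small, which matters for tables with tens of millions to over a billion records.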
The schema for the updated MySQL database:
The data is secured within the Cloud Kotta secure data enclave and is only accessible from within the network. To request access privileges, please contact the Administrators.
Even derivatives of this dataset could be considered sensitive, so users are advised to confirm with the Administrators before publishing jobs that have derivatives in the output set.
None that are public.