Summary: Veeva OpenData Explorer is a new web-based portal to access approximately 16 million healthcare professionals (HCPs), healthcare organizations (HCOs), and their affiliations spanning 34 countries. The open API simplifies integration of Veeva OpenData with third-party applications and services so companies can leverage their customer data where they need it. With these latest innovations, Veeva is giving customers greater choice in how they use Veeva OpenData and making it even easier to access accurate customer data.
Summary: Netflix has open-sourced Metaflow, an internally developed tool for building and managing Python-based data science projects. Metaflow addresses the entire data science workflow, from prototype to model deployment, and provides built-in integrations with AWS cloud services.
Summary: Unlike in the banking or pharmacy industry, where regulations compel companies to have good governance in place, in industries such as publishing and telecom data governance often seems complicated and theoretical. That’s according to Sara Willovit, Product Data Governance at Becton Dickinson.
Summary: Generative AI language models like OpenAI’s GPT-2 produce impressively coherent and grammatical text, but controlling the attributes of this text, such as the topic or sentiment, requires architecture modification or tailoring to specific data. That’s why a team of scientists at Uber, Caltech, and the Hong Kong University of Science and Technology devised what they call the Plug and Play Language Model (PPLM), which combines a pretrained language model with one or more attribute classifiers that guide novel text generation.
Summary: Data can be anywhere. Companies store data in the cloud, in data warehouses, in data lakes, on old mainframes, in applications, on drives, and even on paper spreadsheets. Every day we create 2.5 quintillion bytes of data, and there are no signs of this slowing down anytime soon.
Summary: While much work in data science to date has focused on algorithmic scale and sophistication, safety (that is, safeguards against harm) is a domain no less worth pursuing. This is particularly true in applications like self-driving vehicles, where a machine learning system’s poor judgment might contribute to an accident.
Summary: The pandas library offers core functionality for preparing your data in Python. But many users don't go beyond the basics, so learn about these lesser-known advanced methods that will make handling your data easier and cleaner.
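The summary above does not say which methods the article covers, so the following is only an illustrative sketch of a few pandas methods that tend to be underused: `query`, `pipe`, `assign`, and `explode`. The `orders` data and the `add_tax` helper are hypothetical, invented here for the example.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
orders = pd.DataFrame({
    "customer": ["ann", "bob", "cy"],
    "items": [["pen", "ink"], ["pad"], []],
    "total": [12.0, 3.5, 0.0],
})


def add_tax(df, rate):
    """Hypothetical helper: return a copy of df with a tax-inclusive column."""
    return df.assign(total_with_tax=df["total"] * (1 + rate))


result = (
    orders
    .query("total > 0")        # filter rows using an expression string
    .pipe(add_tax, rate=0.1)   # apply a custom function inside the chain
    .explode("items")          # expand each list element into its own row
    .reset_index(drop=True)    # renumber rows after the explode
)

print(result)
```

Chaining the steps this way keeps each transformation on its own line and avoids intermediate variables, which is one common reason these methods are recommended for cleaner data preparation code.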