Summary: Being a data scientist is hard. The job demands a combination of advanced mathematics and coding skills, and because the role is new to many organizations, data scientists are also called upon to navigate corporate landscapes, source the right IT resources, and establish new workflows across departments.
These best practices will help data leaders be more effective at their jobs, lead the way for future data scientists, and establish a department that’s innovative and productive.
OPTIMIZE USE OF OPEN-SOURCE
Because open-source tools are such an important part of the data science technology stack, hiring criteria should reflect this.
Data scientists with experience contributing to open-source projects will have a better understanding of how to evaluate and manage open-source tools by looking at code activity, package metadata, release history, and project contributors.
They should also understand when and how to make pull requests when packages can be updated, enhanced, or made more secure to meet an organization’s needs. In addition to hiring data scientists and developers with open-source expertise, consider working with a vendor that provides support for open-source tools and libraries.
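As a rough illustration of what looking at package metadata can mean in practice, the sketch below lists the name, version, declared license, and home page of every package installed in the current Python environment. It uses only the standard-library importlib.metadata module; the function name summarize_installed_packages and the choice of fields are assumptions made for this example, not something the article prescribes.

```python
from importlib import metadata


def summarize_installed_packages():
    """Collect basic metadata for every distribution in this environment.

    Hypothetical helper for illustration: the selected fields are one
    possible starting point for reviewing open-source dependencies.
    """
    rows = []
    for dist in metadata.distributions():
        meta = dist.metadata
        rows.append({
            "name": meta.get("Name", "unknown"),
            "version": dist.version,
            # Many packages leave these fields empty or omit them entirely.
            "license": meta.get("License", "not declared"),
            "home_page": meta.get("Home-page", "not declared"),
        })
    # Sort case-insensitively so the report is easy to scan.
    return sorted(rows, key=lambda row: str(row["name"]).lower())


if __name__ == "__main__":
    for row in summarize_installed_packages():
        print(f'{row["name"]} {row["version"]} ({row["license"]})')
```

Release history, code activity, and contributor information are not stored in the installed metadata, so they would have to come from the package index or the project repository.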
INSTITUTE A SECURITY-AWARE CULTURE
When data scientists don’t monitor for potential threats, vulnerabilities inevitably creep into models over time. Data science leaders must step up and collaborate with IT and security leaders to take charge of their data science and machine learning pipelines.
Because these pipelines usually involve the use of open-source libraries, it’s important to understand an organization’s risk tolerance for open-source software. Learn about Common Vulnerabilities and Exposures (CVEs), how to look for them, and how to monitor environments for high-risk packages. Ignoring a CVE with a high severity score can result in data breaches and unstable applications.
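Monitoring an environment for high-risk packages can be sketched as a comparison between what is installed and a list of known advisories. The example below is a minimal illustration under stated assumptions: the Advisory class, the flag_high_risk function, the package names, and the placeholder CVE identifiers are all made up for this sketch, and a real workflow would pull advisories from a vulnerability database or a scanning tool. The 7.0 threshold reflects the CVSS v3 scale, where scores of 7.0 and above are rated High or Critical; an organization’s own risk tolerance may call for a different cutoff.

```python
from dataclasses import dataclass

# CVSS v3 rates scores from 7.0 upward as High or Critical.
HIGH_SEVERITY = 7.0


@dataclass(frozen=True)
class Advisory:
    """Hypothetical record describing one known vulnerability."""
    cve_id: str
    package: str
    affected_versions: frozenset
    cvss_score: float


def flag_high_risk(installed, advisories, threshold=HIGH_SEVERITY):
    """Return advisories that hit an installed version at or above threshold.

    `installed` maps lowercase package names to exact version strings.
    """
    findings = []
    for adv in advisories:
        version = installed.get(adv.package.lower())
        if version in adv.affected_versions and adv.cvss_score >= threshold:
            findings.append(adv)
    # Most severe first, so the riskiest packages are reviewed first.
    return sorted(findings, key=lambda a: a.cvss_score, reverse=True)


# Made-up data for illustration only; these are not real packages or CVEs.
installed = {"examplelib": "1.2.0", "otherlib": "3.1.4"}
advisories = [
    Advisory("CVE-0000-0001", "examplelib", frozenset({"1.2.0", "1.2.1"}), 9.8),
    Advisory("CVE-0000-0002", "otherlib", frozenset({"3.1.4"}), 4.3),
    Advisory("CVE-0000-0003", "examplelib", frozenset({"0.9.0"}), 8.1),
]

for finding in flag_high_risk(installed, advisories):
    print(finding.cve_id, finding.package, finding.cvss_score)
```

Exact version matching keeps the sketch short; real advisories usually describe affected version ranges, which would need a proper version parser.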
DEVISE A TEAM STRUCTURE THAT MAXIMIZES IMPACT
Many data scientists don’t start out on teams; rather, they are scattered across the organization and assigned to specific lines of business to solve particular problems.
This is usually effective for organizations starting their data science journeys because it’s easier to demonstrate business impact with small, focused projects. But over time, data scientists will need to collaborate to develop processes and eliminate redundancy. They will also need to work with IT to understand how to put projects into production, assess the limits of their resources, and understand security standards.
Many organizations have found success in adopting a hub-and-spoke model, in which some data scientists remain within lines of business, while others work in a data science lab or center to help data scientists and analysts across the organization.
Full Text: Dataconomy