Dotscience emerged from stealth with its platform for collaborative, end-to-end ML data and model management.
By giving teams the ability to collaboratively track runs (records of the data, code and parameters used when training an AI model), Dotscience empowers ML and data science teams in industries including fintech, autonomous vehicles, healthcare and consulting to achieve reproducibility, accountability, collaboration and continuous delivery across the AI model lifecycle. The Dotscience platform is available now as SaaS or on-prem, and will be available on the Amazon Web Services (AWS) Marketplace in August.
"The current state of AI development is a lot like software development in the 1990s. Before the movement called DevOps, modern best practices such as version control, continuous integration and continuous delivery were far less common and it was normal that software took six months to ship. Now software ships in minutes," said Luke Marsden, founder and CEO of Dotscience. "At Dotscience, we are applying the same principles of collaboration, control and continuous delivery of DevOps to AI in order to simplify, accelerate and control AI development."
Dotscience provides a tool that manages the complete AI lifecycle while letting data scientists and ML engineers work in the ways with which they are already familiar. Data science and ML teams get a platform that is easy to use and provides a single place to collaborate on, develop, test, monitor and deliver their ML projects.
"In practical terms, and unlike other offerings on the market, this means that teams can continue using the same development tools, ML frameworks, languages, data sources and compute instead of being forced into a walled garden which risks vendor lock-in and steep learning curves," said Mark Coleman, VP of Product and Marketing at Dotscience. "Because Dotscience tracks and packages together every run that goes into the data engineering and model creation process, users can replicate each other's work, collaborate easily and track back as needed."
Dotscience offers data science and ML teams the following key benefits:
- Seamless flexibility and integration all from one platform: Dotscience users can easily attach any compute to the platform, whether it is their own laptop, cloud-based VMs or on-prem bare metal. After a user trains a model, Dotscience integrates with continuous integration and monitoring tools so that they can deploy and then monitor the models in production, keeping all relevant information in one place.
- Optimal team productivity: By providing an automated ML knowledge base to eliminate silos, Dotscience removes key-person risk, making it easy for any data scientist or ML engineer to pick up where another left off, an attribute that is especially important in today's competitive hiring landscape. Dotscience allows teams not only to collaborate seamlessly but also to discover previous work and see exactly how it was built by tracking every version of every element in the model development phase.
- Flexible access to compute and hybrid cloud portability for ML development environments: Team members can start working on their laptop, then move their AI workload to a bigger cloud machine or a bare metal GPU rig when they need extra power, all without having to create a support request. The code, data, environment and hyperparameters needed to reproduce the development environment are packaged together so that moving from one cloud to another, or to on-prem, is seamless.
- Ability to work with data from any source: Dotscience works with flat files stored directly in Dotscience, data in remote object storage (e.g., S3 or S3-compatible, Azure or GCS) and data from SQL, NoSQL and Spark data lakes. This flexibility allows data science and ML teams to get started immediately with whichever data sources are already in use. Dotscience doesn't force the ingest of all data; given a compatible object store, it can track the provenance of data where it already exists.
- Allows AI and data science teams to use the tools they care about, while removing the obstacles that aren't central to productivity: Using Dotscience's tracked workflows, data scientists and ML engineers can train models with the open source tools they know and love, such as PyTorch, Keras and TensorFlow. They can use Jupyter notebooks natively in the application or work on the command line, which lets them use the IDE of their choice.
- Guarantees compliance with current and future regulation: ML models are designed to make decisions, but incorrect decisions can lead to serious financial, reputational and legal risk. Dotscience both monitors ML models to detect issues early and makes it possible to forensically reproduce any issues that occur, so they can be quickly addressed and fixes confidently deployed.
Dotscience provides end-to-end ML lifecycle management without forcing users to change their working practices, and this approach also extends to the installation options.
Customers can choose to deploy the hosted SaaS and bring their own compute, or install a fully private version of Dotscience, either manually or through the Dotscience installer in the AWS Marketplace, which will be available in August. Installers for Microsoft Azure and Google Cloud Platform will soon be available as well. This flexibility means that a broad user base can access an integrated ML platform that provides unified version control and collaboration for data scientists.
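For readers unfamiliar with the idea of a tracked run (a record of the data, code and parameters used when training a model), the following is a minimal illustrative sketch in Python. It is not Dotscience's API or implementation; the names `RunRecord`, `record_run` and `run_id` are hypothetical and exist only to show how fingerprinting the three inputs can make a training run identifiable and comparable.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical run record: fingerprints of data and code, plus parameters."""
    data_hash: str
    code_hash: str
    params: dict

    def run_id(self) -> str:
        # Serialize with sorted keys so parameter ordering does not change the ID.
        canonical = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return sha256_hex(canonical)[:12]


def record_run(data: bytes, code: str, params: dict) -> RunRecord:
    """Build a run record from raw training data, source code and parameters."""
    return RunRecord(
        data_hash=sha256_hex(data),
        code_hash=sha256_hex(code.encode("utf-8")),
        params=dict(params),
    )


if __name__ == "__main__":
    run = record_run(b"x,y\n1,2\n", "model.fit(X, y)", {"lr": 0.01, "epochs": 5})
    print(run.run_id(), run.params)
```

In this sketch, two runs with identical data, code and parameters yield the same identifier, while changing any one of the three yields a different one, which is the property that lets a teammate check whether they have reproduced a colleague's run.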