An unused AI system delivers no value. It’s essential to develop a production process that smoothly transitions the AI systems you have in development to live use. Below, we describe an optimal production pipeline – in which rapid iteration, appropriate hardware, suitable hosting, rigorous testing and ongoing maintenance deliver high quality results.
By the time you are considering how to take your solution live, you should:
These prerequisites will enable you to determine an optimal process for moving from development to a live solution – the production environment.
Progressing an AI system from idea to reality should follow broadly conventional development practice – although timescales may be less certain. After ideation, undertake research, prototype and then develop a minimum viable product (MVP). Once in production, undertake cycles of ideation, research, development and quality assurance.
Source: MMC Ventures
Whether you have an in-house team or are outsourcing, ensure the team understands the characteristics required from the end system. Which considerations are flexible? Which are not? Different decisions will be made if speed is more important than accuracy – and vice versa.
Even in the research phase, ensure that development is undertaken in the language used for deployment. If you plan to deploy in Python, for example, avoid the overhead of rewriting models created in MATLAB or R.
Initially, optimise for speed over quality. It’s better to release an early version of a model from the research environment into production, and then to solicit feedback in the live environment, than to wait until the research model is perfect. “Spend a month to get a weak model and then iterate to make it great” (Eddie Bell, Director of Machine Learning, Ravelin). Isolating models within the research environment will push considerations of deployment, usability, performance and scalability to the end of the project instead of addressing them early. In addition, it increases the risk of a model performing poorly with unexpected real-world data. Many data scientists resist releasing models that are not ‘good enough’. Overcome this hurdle by developing a culture in which the dynamics of AI development are understood and people are not blamed for early, poor quality results.
Effective research & development requires appropriate hardware – see page 55 for guidance.
Ensure your AI team has its code in a source control system – Git, Mercurial and Subversion are popular – and update it regularly. The size of trained models can exceed file size limits on these systems. If file size is a constraint, find an alternative way of versioning and storing your files. A simple solution (for example, creating zip files on a shared drive) can be effective but ensure these files are regularly backed up to prevent accidental deletion or changes breaking your AI models.
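A simple shared-drive scheme like the one described above can be made safer with version numbers and checksums. The sketch below is illustrative only – the paths and naming scheme are assumptions, and dedicated tools such as Git LFS or DVC offer more robust alternatives:

```python
import hashlib
import shutil
from pathlib import Path


def archive_model(model_path: str, store_dir: str) -> Path:
    """Copy a trained model file into a versioned store and record its checksum.

    Files gain an incrementing version suffix (model_v1.bin, model_v2.bin, ...)
    and a .sha256 sidecar file, so accidental changes can be detected later.
    """
    src = Path(model_path)
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)

    # Next version number = one more than the versions already stored.
    version = 1 + len(list(store.glob(f"{src.stem}_v*{src.suffix}")))
    dest = store / f"{src.stem}_v{version}{src.suffix}"
    shutil.copy2(src, dest)

    # Record a checksum alongside the copy to detect silent corruption.
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    dest.with_suffix(dest.suffix + ".sha256").write_text(digest)
    return dest
```

Comparing a stored file's checksum against its sidecar during backup verification catches the "accidental change" failure mode the text warns about.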
Your research team may find that it is creating many similar models – for comparable problems or for multiple clients. Automate repetitive tasks as far as possible, with your research team validating the results and using their specialised skills to adjust the network architectures.
During the research and development phase, your non-AI development and production teams should take the AI models you have in development and embed them in the environments where they will be used.
These early prototypes will be incomplete and unfriendly for users, but will show the capacity for AI to solve the problem. Before your system can become a minimum viable product (MVP), prototypes will highlight related development work required – including website changes, the creation of database connections, mobile application modifications or development of application programming interfaces (APIs). Prototyping will engage stakeholders, allow other applications to call the model, enable initial scrutiny of results, and serve as a starting point for improvement.
During the prototype phase it is critical to solicit feedback from people outside the AI and production teams. Begin with internal stakeholders and, with each improvement, move closer to feedback from end users. Are your models:
Answering these questions early will avoid expensive redevelopment later. As with developing non-AI systems, frequent and iterative changes offer flexibility to address difficulties as they emerge.
Before your team completes the research and development iterations that feed your prototypes, finalise plans for a release process and for deploying code to its final environment. The number of stages in this process, and its complexity, will depend on factors including: the importance of controlling the code (processes for code review, testing, code merging, build, and versioning); the implications of system downtime; and the level of automation you require.
Considerations are company-specific – but evaluate:
If you have existing development practices, follow them to the extent possible to ensure that AI is not treated separately from the rest of your team’s development efforts.
While automating release based on certain metrics may be straightforward, understanding whether a new AI system is an improvement overall may be difficult. A new version of your AI system may offer improved accuracy at the expense of speed, or vice versa. Whether you are automating deployment or verifying it manually, prioritise what is important to your use case.
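One way to make that prioritisation explicit is to combine your metrics into a single weighted score before comparing models. The sketch below is an illustrative assumption, not a standard formula – the metric names and weights are placeholders for whatever matters in your use case:

```python
def release_score(metrics: dict, weights: dict) -> float:
    """Combine model metrics into a single comparable score (higher is better).

    Latency is inverted so that faster models score higher. The metric names
    and weighting scheme here are illustrative, not a standard.
    """
    return (
        weights["accuracy"] * metrics["accuracy"]
        + weights["speed"] * (1.0 / metrics["latency_ms"])
    )


def should_release(new: dict, old: dict, weights: dict) -> bool:
    """Release the new model only if its weighted score beats the old one."""
    return release_score(new, weights) > release_score(old, weights)
```

With accuracy-heavy weights, a slightly slower but more accurate candidate passes; with speed-heavy weights, the same candidate is rejected – making the trade-off a deliberate, recorded decision rather than an implicit one.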
“Spend a month to get a weak model and then iterate to make it great.”
Eddie Bell, Director of Machine Learning, Ravelin
With an initial model, supporting code and an established deployment process you should have a minimum viable product (MVP) ready for release to your production (live) environment. The MVP is distinct from your prototypes – while imperfect, it will contain all the elements required to solve the problem, including peripheral components (web interfaces, APIs, storage and version control).
Having a model in production does not mean it needs to be publicly visible or impact live results. It should, however, be exposed to live data so your team can make refinements until it meets the requirements for a major release. With live data you can undertake longer-running tests and provide your data science team with feedback on what is working well and what is not. At this stage, prioritise establishing a controlled release process with thorough code testing, and the stability of your solution. You should also monitor the performance and scalability of your system.
Plan continual cycles of improvement – investigate and implement ideas for iterating the model, changing the interface and responding to feedback. New models must be demonstrably superior to old ones. Test all changes before updates are released to the production environment, allocating work between the AI team and the general development team. These cycles will continue for the life of the system.
If you’ve yet to decide where your system will run – on premise, in a data centre or in the cloud – at this point you will have the information you need to select an environment, and hardware, that are suitable for your needs.
On-premise: If your data is highly sensitive and cannot leave your network, or you wish to keep data and inferencing entirely under your control, you may wish to host your AI systems within your own premises. Usually, this is possible only for companies that already have their own internal hardware infrastructure. This can be a highly cost-effective option if the volume of requests to manage is known and relatively stable. However, all new hardware must be ordered and provisioned, which will limit scalability. Further, security will be entirely your responsibility. As such, on-premise deployment is a less preferred option for early stage companies, which typically lack these specialised skills.
Use on-premise if you:
- Need to fix your costs
- Have existing on-premise hardware
- Are working on highly sensitive data

Avoid on-premise if you:
- Do not have robust in-house security expertise
- Cannot guarantee volumes of requests
- Need your models to be accessed from outside your network
Source: MMC Ventures
Data centre: If you can afford the capital expense of buying servers, and have limited need to scale rapidly, hosting your own hardware in a data centre – either your own or a third party – can be an attractive option. The cost, even over a year, can be far lower than using a cloud service and you will maintain control over your system’s performance. Using a data centre can also be sensible when you already have large volumes of data on your own servers and wish to avoid the cost of uploading the data to a cloud service.
The capital expense of establishing and managing a data centre can, however, be high – although for early stage companies there are programmes, such as NVIDIA Inception, that offer discounted hardware. As with the on-premise option, only consider a data centre approach if your company already operates a data centre for other servers and has staff with the skills to install and configure your hardware. Beyond the upfront cost, managing a data centre may prove an unwelcome distraction for an early stage company focused on its core initiatives.
Cloud: For good reason, many early stage companies choose cloud hosting from the outset. Amazon AWS, Google Cloud, Microsoft Azure and Rackspace are popular cloud providers. Most cloud providers offer specialised options, so you can begin quickly and need to set up little more than a security layer and links to other systems in their cloud. Further, time-based pricing allows rapid upscaling and downscaling of resources as required. For companies without dedicated system administrators, cloud may be an easy choice. You will, however, pay a premium for the service. A year of cloud hosting can cost twice as much as hosting in a data centre. You will also pay to transfer data in and out. Nonetheless, costs are payable monthly, rather than as a single large capital expenditure, and you will avoid the cost of staff to manage the hardware.
Use a data centre if you:
- Wish to fix your costs
- Have existing data centre hardware
- Seek control over your data

Avoid a data centre if you:
- Require flexibility in your resourcing
- Wish to avoid high up-front capital costs
Source: MMC Ventures
Unless there is a compelling reason otherwise – cost, location, or the replacement of a supplier – it is usually desirable to use the same cloud provider for your AI hosting that you use for your other infrastructure. This will limit data transfer costs and provide a single infrastructure to manage for security and resilience.
Although cloud systems offer extensive physical security, be aware that you are placing your software and data on the internet. If you do not secure the cloud servers that you establish, you will be at risk. Your cloud provider should ensure that: access to their data centre is secure; their data centre has multiple power sources and internet connections; and there is resilience in all supporting infrastructure such that the provider can resist any direct security challenge either in person or via attempted hacks into their network. They should also ensure that the data images they provide to you are secured from the rest of their infrastructure and other customers’.
It is your responsibility, however, to ensure that your systems on their infrastructure are secure. Direct access to your account should only be via multi-factor authentication – not a simple username and password. Stored data should be private, and any external data access or calls to your AI must be established using best practices for authentication. Many malicious actors scan the IP addresses registered to cloud providers, looking for unsecured systems they can exploit.

Consider, too, the physical location in which your cloud servers are hosted. Different countries have different rules regarding data and hardware, and you may need to keep your data within its area of origin. Be aware of local laws that could allow cloud servers to be restricted. US law, for example, allows hardware from a cloud provider to be seized if authorities suspect its use for criminal activity. If you are unlucky enough to have data on the same physical system, you could lose access to your systems without notice. This risk can readily be mitigated with appropriate monitoring of your remote systems, and with images of your servers that you can start in other zones if required. Finally, different regions may offer different performance at different times of day – a dynamic you can use to your advantage.
Use cloud if you:
- Need flexibility in resource
- Have existing systems and data in the cloud
- Have limited capital to get started

Avoid cloud if you:
- Already have systems and personnel established in a data centre
- Use highly sensitive data
Source: MMC Ventures
“For good reason, many early stage companies choose cloud hosting from the outset.”
Proving that new AI releases are effective, and an improvement on prior versions, differs from the typical software quality assurance (QA) process. Test your AI system at multiple stages:
“Accuracy” has a specific meaning in AI – but, confusingly, is also used as a general term to cover several measures. There are three commonly used measures of accuracy in AI: recall, precision and accuracy. Understand these measures so you can decide which are important for your systems and validate them appropriately.
Consider an AI that determines whether an apple is ‘good’ or ‘bad’ based on a picture of the apple. There are four possible outcomes:
1. True positive: The apple is good – and the AI predicts ‘good’.
2. True negative: The apple is bad – and the AI predicts ‘bad’.
3. False positive: The apple is bad – but the AI predicts ‘good’.
4. False negative: The apple is good – but the AI predicts ‘bad’.
Using the example above:
- Recall: the proportion of genuinely good apples that the AI correctly identifies – true positives ÷ (true positives + false negatives).
- Precision: the proportion of apples the AI predicts to be ‘good’ that really are good – true positives ÷ (true positives + false positives).
- Accuracy: the proportion of all predictions that are correct – (true positives + true negatives) ÷ all predictions.
Avoid the temptation to use a single measure that flatters results. You will obtain a truer picture by using all three measures.
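The three measures can be computed directly from the four outcome counts. A minimal sketch, with illustrative counts in the usage note:

```python
def recall(tp: int, fn: int) -> float:
    """Of all genuinely good apples, what share did the AI call 'good'?"""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Of all apples the AI called 'good', what share really were good?"""
    return tp / (tp + fp)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """What share of all predictions were correct?"""
    return (tp + tn) / (tp + tn + fp + fn)
```

With illustrative counts of 80 true positives, 10 true negatives, 8 false positives and 2 false negatives, recall is roughly 0.98 while precision is only 0.91 – reporting the first alone would flatter the system, which is exactly why all three measures should be checked together.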
Balancing precision and recall can be difficult. As you tune your system for higher recall – fewer false negatives – you will increase false positives, and vice versa. Whether you elect to minimise false negatives or false positives will depend on the problem you are solving and your domain. If developing a marketing solution, you may wish to minimise false positives. To avoid the embarrassment of showing an incorrect logo, missing some marketing opportunities may be acceptable. If developing medical diagnostics, on the other hand, you may wish to minimise false negatives to avoid missing a diagnosis.
Automate testing to as great an extent as possible. Every new model should be tested automatically. “Efficiency is critical. If you have to do something more than once, automate it.” (Dr. Janet Bastiman, Chief Science Officer, Storystream). If all measures of accuracy are higher, the decision to deploy the new model will be straightforward. If measures of accuracy decrease, you may need to verify the new model manually. A decrease in one measure of accuracy may not be problematic – you might have re-tuned your model for precision or recall, or decided to change the entire model to improve performance. If your models produce results that are concerning, speak to your AI team to discuss why. It may be that your training data set does not contain enough appropriate data. If you encounter problems, add examples of these types of data to your test set so you can monitor improvements.
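One way to automate this gate is to recompute all three measures for every candidate model and compare them against the production baseline. The sketch below is a simplified illustration – `predict`, the labelled test set and the tolerance are placeholders for your own components:

```python
def evaluate(predict, test_set):
    """Count binary-classification outcomes over a labelled test set.

    predict: callable mapping an input to True ('good') or False ('bad').
    test_set: iterable of (input, true_label) pairs.
    """
    tp = tn = fp = fn = 0
    for item, label in test_set:
        pred = predict(item)
        if pred and label:
            tp += 1
        elif not pred and not label:
            tn += 1
        elif pred and not label:
            fp += 1
        else:
            fn += 1
    total = tp + tn + fp + fn
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


def passes_gate(measures, baseline, tolerance=0.0):
    """Pass only if no measure falls below the production baseline by more
    than the allowed tolerance. A failure triggers manual review, not an
    automatic rejection."""
    return all(measures[k] >= baseline[k] - tolerance for k in baseline)
```

A non-zero tolerance lets a deliberately re-tuned model (say, trading a little precision for recall) pass automatically, while larger regressions are routed to your AI team for discussion – matching the manual-verification step described above.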
A deployed AI solution reflects a point in time; available data, business requirements, market feedback and available techniques will change. Beyond the typical maintenance you would perform on any software system, you need to verify and update your AI system on an ongoing basis. Once your solution is live, ensure it continues to perform well by:
AI technology is developing at pace. Further, the varieties and volume of available training data continue to evolve. Invest in continual improvement to ensure the system you develop today avoids obsolescence.
“Available data, business requirements and techniques will change over time. Invest in continual improvement to avoid obsolescence.”