To create effective machine learning and deep learning models, you need lots of data, a way to clean it, and a way to perform feature engineering on it. You also need a way to train models on your data in a reasonable amount of time. After that, you need a way to deploy your models, monitor them for drift over time, and retrain them as needed.
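As a minimal illustration of drift monitoring, the sketch below compares the distribution of one feature in the training data against recent production data using a two-sample Kolmogorov-Smirnov test; the feature arrays and the 0.01 threshold are hypothetical choices, not a prescribed method.

```python
# Minimal drift check: compare a feature's training distribution with recent
# production data using a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical data: what the model was trained on vs. what it sees now.
train_feature = np.random.normal(loc=0.0, scale=1.0, size=10_000)
live_feature = np.random.normal(loc=0.3, scale=1.1, size=2_000)

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:  # the threshold is a judgment call, not a standard
    print(f"Possible drift (KS={stat:.3f}, p={p_value:.4f}) -- consider retraining")
else:
    print("No significant drift detected")
```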
You can do all of that on-premises if you have invested in compute resources and accelerators such as GPUs, but you may find that adequate resources sit idle much of the time. On the other hand, it can sometimes be more cost-effective to run the entire pipeline in the cloud, using large amounts of compute resources and accelerators as needed and then releasing them.
The cloud providers have put significant effort into building out their machine learning platforms to support the entire machine learning lifecycle, from planning a project to maintaining a model in production. So what capabilities should every end-to-end machine learning platform provide?
If you have the huge amounts of data needed to build precise models, you probably don't want to ship it halfway around the world. The issue here isn't distance, it's time: data transmission speed is ultimately limited by the speed of light, even on a perfect network with infinite bandwidth. Long distances mean latency.
The ideal case for very large datasets is to build the model where the data already resides, so that no mass data transmission is needed. Several databases support this to a limited extent.
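BigQuery ML is one example of this pattern: a model is trained with a SQL statement inside the warehouse, so the training data never leaves it. The sketch below is only illustrative; the dataset, table, and column names are hypothetical, and it assumes Google Cloud credentials are configured.

```python
# Train a regression model inside the data warehouse with BigQuery ML,
# so the training data never has to be exported.
from google.cloud import bigquery

client = bigquery.Client()

query = """
CREATE OR REPLACE MODEL `my_dataset.demand_model`
OPTIONS(model_type='linear_reg', input_label_cols=['units_sold']) AS
SELECT price, promo_flag, day_of_week, units_sold
FROM `my_dataset.sales_history`
"""

client.query(query).result()  # blocks until the in-database training job finishes
print("Model trained in-database; no data left the warehouse.")
```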
ETL (extract, transform, and load) and ELT (extract, load, and transform) are two common data pipeline configurations in the database world. Machine learning and deep learning increase the need for these, especially the transform step. ELT gives you more flexibility when your transformations need to change, since the load phase is usually the most time-consuming part for big data.
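To make the ELT pattern concrete, the sketch below loads raw records into a staging table unchanged and runs the transformation inside the database afterward, so the transform can be rewritten without reloading. The file and table names are made up for illustration.

```python
# ELT sketch: load raw data as-is first, transform later inside the database.
import sqlite3
import pandas as pd

conn = sqlite3.connect("warehouse.db")

# Load: dump the raw records into a staging table without reshaping them.
raw = pd.read_csv("raw_events.csv")  # hypothetical export from a source system
raw.to_sql("staging_events", conn, if_exists="replace", index=False)

# Transform: reshape into features only when (and as often as) needed.
conn.execute("""
    CREATE TABLE IF NOT EXISTS features AS
    SELECT user_id,
           COUNT(*)            AS event_count,
           AVG(session_length) AS avg_session_length
    FROM staging_events
    GROUP BY user_id
""")
conn.commit()
conn.close()
```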
The conventional wisdom used to be that you should import your data to your desktop for model building. The sheer quantity of data needed to build good machine learning and deep learning models changes the picture: you can download a small sample to your desktop for exploratory data analysis and model prototyping, but for production models you need access to the full data.
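A common workflow is to pull down only a small random sample for local exploration while the full dataset stays in place. A minimal sketch, again with a hypothetical BigQuery table name and an arbitrary sampling rate:

```python
# Pull a small random sample for local exploratory analysis; the full table
# stays in the warehouse.
from google.cloud import bigquery

client = bigquery.Client()
sample = client.query("""
    SELECT *
    FROM `my_dataset.sales_history` TABLESAMPLE SYSTEM (1 PERCENT)
""").to_dataframe()

print(sample.describe())  # quick local EDA on roughly 1% of the data
```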
Except when training models, the compute and memory requirements of notebooks are usually minimal. It helps a great deal if a notebook can spawn training jobs that run on multiple large virtual machines or containers. It also helps if the training can access accelerators such as GPUs, TPUs, and FPGAs; these can turn days of training into hours.
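The sketch below shows the kind of device handoff this implies in PyTorch: the same training step runs on a CPU or, when one is available, on a GPU. The model, data, and hyperparameters are toy placeholders.

```python
# Toy PyTorch training step that uses a GPU when one is available.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(256, 20, device=device)  # fake batch of features
y = torch.randn(256, 1, device=device)   # fake targets

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"Trained on {device}; final loss {loss.item():.4f}")
```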
Not everyone is good at picking machine learning models, selecting features, and engineering new features from the raw observations. These tasks are time-consuming and can be automated to a large extent. AutoML systems often try out many models to see which yield the best objective function values, for example the minimum squared error for regression problems. The best AutoML systems can also perform feature engineering and use their resources effectively to pursue the best possible models with the best possible sets of hyperparameters.
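A stripped-down version of what an AutoML system does for regression can be sketched with scikit-learn: try several candidate models and keep the one with the lowest cross-validated squared error. The candidates and synthetic dataset here are purely illustrative.

```python
# Miniature "AutoML" loop: compare several regressors by cross-validated
# mean squared error and keep the best one.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=2_000, n_features=20, noise=10.0, random_state=0)

candidates = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    results[name] = -scores.mean()  # flip the sign back to plain MSE

best = min(results, key=results.get)
print(f"Best model: {best} (MSE={results[best]:.1f})")
```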
The major cloud platforms offer robust, tuned AI services for many applications, not just image recognition: for example, language translation, speech to text, text to speech, forecasting, and recommendations. These services have already been trained and tested on more data than is usually available to businesses, and they are deployed on service endpoints with enough computational resources, including accelerators, to ensure good response times under worldwide load.
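Consuming one of these pre-trained services usually amounts to a single API call. As a hedged example using the Google Cloud Translation client (the credentials setup and the sample text are assumptions):

```python
# Call a pre-trained cloud AI service (Google Cloud Translation) instead of
# training a model yourself. Requires Google Cloud credentials to be configured.
from google.cloud import translate_v2 as translate

client = translate.Client()
result = client.translate("Where is the nearest train station?", target_language="de")
print(result["translatedText"])
```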
Finally, you need ways to control the costs incurred by your models. Serving production predictions frequently accounts for 90% of the cost of deep learning, while training accounts for only 10%.
The best way to control prediction costs depends on your load and the complexity of your model. If the load is high, you might be able to use an accelerator to avoid adding more virtual machine instances. If the load is variable, you might be able to dynamically change the number or size of instances or containers as the load scales up and down. And if the load is low or occasional, you might be able to handle predictions with a very small instance that has a partial accelerator.
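For variable load on a containerized model server, autoscaling the number of replicas is the usual lever. The sketch below creates a CPU-based horizontal pod autoscaler with the Kubernetes Python client; the deployment name, namespace, and thresholds are hypothetical.

```python
# Scale a model-serving Deployment between 1 and 10 replicas based on CPU load,
# using the Kubernetes Python client. Names and thresholds are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # assumes a configured kubeconfig

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="model-server-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="model-server"
        ),
        min_replicas=1,
        max_replicas=10,
        target_cpu_utilization_percentage=70,
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```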