- Developing and optimizing a large neural network can emit as much as 284,000 kilograms of carbon dioxide.
- This is equivalent to 5 times the lifetime emissions of an average car.
Recent advances in the field of artificial intelligence (AI) have ushered in a new era of large networks trained on massive data. These networks have delivered dramatic accuracy improvements across several core natural language processing (NLP) tasks.
The most resource-hungry models, in particular, have achieved the highest scores. However, training such models requires a huge amount of computational resources, which demand significant energy.
Recently, researchers at the University of Massachusetts Amherst published a paper in which they quantified the carbon dioxide emissions of AI models by performing a life-cycle assessment of training large neural networks.
A decade ago, NLP models could be developed and trained on a conventional server or laptop, but that’s not the case anymore. Today, highly accurate models require multiple instances of TPUs (tensor processing units) or GPUs. Research and experiments with model architectures and hyperparameters have further raised hardware costs.
Powering such hardware for weeks leaves a substantial impact on the environment. Although a part of this energy comes from renewable sources, it is limited by the technology we currently have to generate and store it. In fact, most locations do not have sufficient facilities to derive energy from carbon-neutral sources.
Carbon Emissions From Training NLP Models
In this study, researchers characterized the carbon emissions and cost that result from training large neural networks. They estimated how many kilowatt-hours of energy it takes to develop and tune popular NLP models, then converted that figure into approximate electricity costs and carbon emissions.
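The conversion step described above can be sketched in a few lines. The per-kWh electricity price and grid emission factor below are placeholder assumptions for illustration, not figures taken from the study, and the energy value in the usage line is hypothetical:

```python
# Illustrative sketch: converting an energy estimate (kWh) into
# approximate electricity cost and CO2 emissions.
# PRICE_PER_KWH_USD and KG_CO2_PER_KWH are assumed placeholder values,
# not numbers from the UMass Amherst paper.

PRICE_PER_KWH_USD = 0.12   # assumed average US electricity price
KG_CO2_PER_KWH = 0.433     # assumed average US grid emission factor

def training_footprint(energy_kwh: float) -> tuple[float, float]:
    """Return (electricity cost in USD, CO2 emitted in kg)."""
    cost = energy_kwh * PRICE_PER_KWH_USD
    co2_kg = energy_kwh * KG_CO2_PER_KWH
    return cost, co2_kg

# Hypothetical energy draw for a large training-and-tuning pipeline.
cost, co2 = training_footprint(656_000)
print(f"${cost:,.0f}, {co2:,.0f} kg CO2")
```

The point of the sketch is that the emission estimate is a straight multiplication: once the energy draw is measured, the carbon figure follows directly from the local grid's emission factor.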
Estimated CO2 emissions from NLP models vs. other familiar sources of consumption
The findings show that developing and optimizing a large NLP pipeline could emit 284,000 kilograms of carbon dioxide, which is equivalent to 5 times the lifetime emissions of an average car (including its manufacturing process).
Both financial and environmental costs increase in proportion to the size of AI models. But once you add a tuning step to further improve the model's accuracy, the associated costs explode.
More specifically, the tuning process, known as neural architecture search, iteratively tweaks the network's design through intensive trial and error, and it leads to extremely high costs for little performance improvement.
Considering the ongoing trends in the AI field, the significance of this research is huge. Many AI research groups neglect efficiency, because large models trained on abundant data have proven useful across a wide range of tasks.
While computationally efficient algorithms exist, they are rarely used in practice for optimizing NLP models, due to their incompatibility with common deep learning frameworks such as TensorFlow and PyTorch.
According to the researchers, this type of analysis should be carried out routinely to raise awareness of the field's extensive resource consumption and to promote mindful practice and policy.