data150

Future Plans for Predicting CO2 Emissions in Mainland Southeast Asia
by Kyle Chen
Word Count: 2066
Due: May 18, 2021

Problem Statement

Carbon dioxide emissions are the primary human-generated cause of climate change and global warming. Climate change, through the melting of ice at both the North and South Poles, endangers island and coastal communities, who may see their homes swept into the sea. People have adapted by supporting renewable energy sources for local power grids and homes, and by adopting vehicles that run partly or fully on electricity. By mapping carbon dioxide output over time in developing countries, we can develop methods to slow the rate at which carbon dioxide emissions increase, and by adding a predictive element to the model, we can project future carbon dioxide emissions at a resolution of 100x100 meter cells.

Thus, how can we use CO2 detection and prediction algorithms to project future emissions from developing countries onto a map with a high enough resolution to understand how and where CO2 is currently being produced and where it will spread?

Introduction

The development of Southeast Asian transportation infrastructure, including rail- and waterway-based transport, together with growth in industry, has led to an increase in CO2 emissions in the region. The spread of major population centers and the development of smaller population groups should therefore show in current data describing CO2 output, and predictions of future CO2 emissions should give us insight into how the population in these areas will spread. Because transportation infrastructure must develop alongside population centers, CO2 emissions should be a reasonable proxy for the growth of transportation infrastructure. As transportation infrastructure develops and populations increase, trade must increase to meet demand, bettering the lives of people living in Southeast Asia and aiding the human development of the region.

Given the rate at which transportation infrastructure in the region is growing, Southeast Asia has had some of the highest increases in CO2 emissions of any developing region on Earth over the last 30 years. Using Southeast Asia as an initial test for mapping and predicting CO2 emissions would therefore be a step towards decreasing the rate of CO2 emissions from transportation and industry. The growth of Southeast Asian infrastructure can be seen primarily after the CO2 emissions spike and GDP growth of the 1990s, and it continued into the 2010s, when both waterway and road-based trade increased in volume on the mainland. By this point, Indonesia, Malaysia, and Singapore already had primarily waterway-based trade economies. However, the increase in GDP per capita within these countries likely drove the expansion of roadway infrastructure, given the prevalence of the automobile in these areas.

In fact, the shift in exports from China to exports from less developed countries such as Vietnam and Bangladesh has caused a surge in carbon dioxide emissions, while the growth rate of Chinese and Indian emissions slowed between 2004 and 2011. Because trade within the region doubled during this period, some predict that this trend will “seriously undermine international efforts to reduce global emissions.” By 2011, both China and India saw an increasing amount of carbon dioxide emissions going out of the country due to their export economies. Conversely, Southeast Asia, Papua New Guinea, the Middle East, and South America saw an increase in carbon dioxide emissions going into the country due to the import economies of these regions. Between 2004 and 2011, export trade from China to South America, Southeast Asia, and the Middle East increased by at least 100%. Chinese CO2 emissions associated with exports increased from ~150 million tons (Mt) of CO2 per year to ~250 Mt CO2/yr, while other Asian and Pacific states (OAS) saw a 100% increase in the emissions associated with their imports, from ~50 Mt CO2/yr to ~100 Mt CO2/yr, which is comparatively small in absolute terms (Meng et al., 2018).
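The two increases quoted from Meng et al. (2018) are easier to compare once relative and absolute change are separated. The following is a minimal sketch in Python using only the approximate figures given above; the function and variable names are my own.

```python
def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100.0


# Approximate values quoted above from Meng et al. (2018), in Mt CO2 per year.
china_export = (150.0, 250.0)
oas_import = (50.0, 100.0)

# Relative growth: the OAS figure doubles, the Chinese figure grows by about two thirds.
china_pct = percent_change(*china_export)
oas_pct = percent_change(*oas_import)

# Absolute growth: the Chinese increase is twice the OAS increase.
china_abs = china_export[1] - china_export[0]
oas_abs = oas_import[1] - oas_import[0]
```

The OAS increase is therefore larger in relative terms but smaller in absolute terms.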

Vietnamese GDP growth in particular, which averaged a sizable 7.2 percent per year between 1993 and 2013, can be attributed to Vietnam’s economic liberalization in the mid-1980s and its subsequent integration into the global economy. Vietnam favors road-based infrastructure, which accounts for 40% of national freight in ton-kilometers and receives about 80% of public transportation spending, even though road transport produces the most carbon dioxide of any transportation mode. River barge transport is nearly 4 times as fuel efficient as road-based freight trucking. Because the two major population and industrial centers of Vietnam, Hanoi and Ho Chi Minh City, are situated on the Red River and Mekong River Deltas respectively, inland waterway transport (IWT) accounts for 48.3% of the tons of freight transported in-country, compared to road freight’s 45.4% share. However, road-based freight transport has been growing much faster than IWT and accounts for a larger share of freight ton-kilometers, 36.6% compared to IWT’s 30.2%, because road freight travels an average of 143 kilometers, 31 kilometers farther than IWT freight. The Vietnamese fleet of river-going cargo vessels grew from 33,859 vessels in 2000 to 95,126 vessels in 2010, of which 50% are smaller 5-15 ton vessels, with a growing number of larger vessels in the 300 ton and above classes. IWT freight transport to and from Ho Chi Minh City and Hanoi is projected to increase from 200,000 tons per day in 2008 to 300,000 tons per day in 2030. Coastal shipping, according to data from the Vietnam Maritime Administration, also grew at an annual rate of 13.2% between 1998 and 2008.
Overall tonnage passing through Vietnam’s seaports increased from around 40 million tons in 1995 to nearly 195 million in 2008, with ~42% of all tonnage being domestic freight transport and another 32% being export freight transport. These developments are especially important when taking into account the emissions of each transportation type. Taking the upper limits of the logarithmic scale of CO2 emissions in grams per ton-kilometer (g/ton-km), as shown in the figure below, road-based freight is ~253% more CO2 intensive than container vessels with a maximum carrying capacity below 2000 twenty-foot equivalent units (TEUs) and 455% more CO2 intensive than container ships with carrying capacities between 2000 and 8000 TEUs (Blancas & El-Hifnawi, 2014). As the primary trade hubs of the region are in fact its cities, understanding how the cities grow lets us attempt to map CO2 emissions to that growth.
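The freight figures above can be cross-checked with a short calculation. The sketch below, in Python, assumes that the 143 kilometer average haul belongs to road freight, so that the IWT average is 143 - 31 = 112 kilometers; that reading, and all names in the code, are my own assumptions. It also converts "X% more CO2 intensive" into a plain multiple.

```python
def ton_km(tons_share, avg_haul_km):
    """Relative freight activity: tonnage share multiplied by average haul."""
    return tons_share * avg_haul_km


# Shares of tons moved in-country and average haul distances quoted above.
road_activity = ton_km(45.4, 143.0)
iwt_activity = ton_km(48.3, 143.0 - 31.0)  # assumed IWT average haul of 112 km

# Road moves fewer tons but carries them farther, so its ton-km total is larger.
activity_ratio = road_activity / iwt_activity

# Ratio of the ton-kilometer shares reported in the text, for comparison.
reported_ratio = 36.6 / 30.2


def intensity_multiple(percent_more):
    """Convert 'X% more intensive' into a multiple of the baseline."""
    return 1.0 + percent_more / 100.0


road_vs_small_vessels = intensity_multiple(253.0)   # vessels under 2000 TEUs
road_vs_medium_vessels = intensity_multiple(455.0)  # vessels of 2000-8000 TEUs
```

Both ratios come out at roughly 1.2, so the tonnage shares, haul distances, and ton-kilometer shares are consistent with one another under this reading.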

Therefore, pairing a machine-learning algorithm for per-city CO2 emissions prediction with a machine-learning algorithm for city neighborhood growth prediction, linked through a population-per-pixel to CO2 emissions model, would allow us to project predictions onto developing cities. By using the Zipf-to-Pareto constant correlation function, we can predict not only the emissions of another city of a different population, given a single city’s size and current CO2 footprint, but also the increase in CO2 emissions per new city.

Objectives

The objective of this study is to develop a pair of algorithms that predict both the amount of CO2 emitted per person and the spread of population in cities, including whether the model projects outward growth or increasing density. The results of the former will then be projected onto the latter using a population-per-pixel to CO2 emissions model, as shown by Gaughan et al. (2019), in order to understand the growth of developing cities and how to combat increased carbon output in these areas.
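To illustrate what projecting emissions onto a population-per-pixel grid could look like, here is a minimal Python sketch. It is a simplified proportional allocation under my own assumptions, not the method of Gaughan et al.: a known total is divided among cells according to each cell's share of the population. The grid values, the total, and the function name are hypothetical.

```python
def allocate_emissions(population_grid, total_emissions):
    """Split a total across cells in proportion to each cell's population."""
    total_pop = sum(sum(row) for row in population_grid)
    if total_pop == 0:
        raise ValueError("population grid contains no people")
    return [
        [total_emissions * cell / total_pop for cell in row]
        for row in population_grid
    ]


# Hypothetical 3x3 grid of people per 100 m x 100 m pixel.
population = [
    [0, 10, 40],
    [5, 120, 60],
    [0, 15, 50],
]

# Hypothetical city total in tons of CO2 per year.
emissions = allocate_emissions(population, total_emissions=600.0)
```

With this allocation, the densest pixel receives the largest share and unpopulated pixels receive none, which is the behavior a density-versus-sprawl comparison would rely on.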

Research Planning and Budgeting

It would be preferable to utilize the model developed by Gaughan et al. over the model used by Lin et al. The “LMDz-INCA” model, developed by the Laboratoire de Météorologie Dynamique and run by Lin et al. with datasets from the World Data Center for Greenhouse Gases, still needs to be calibrated properly for the detection and estimation of greenhouse gas emissions. Lin et al. used this model to “predict” previously observed methane and carbon dioxide data from 2006-2013 in South and East Asia with two versions: a standard version of the LMDz-INCA model and a zoomed version with a higher resolution centered over India and China. The study used data with a horizontal resolution of 50 km in order to “simulate the variations of CH4 and CO2 during the period 2006–2013.” The model-projected data for Southeast Asia, primarily Indonesia, and for southern East Asia, whose data collection centers are near Taiwan and Hong Kong, was very similar to the observed methane data for the period; methane emissions come primarily from anthropogenic sources, which the study does not specify. However, the model was inconsistent in predicting seasonal carbon dioxide concentrations in parts per million, with low Pearson correlation values of R = 0.27 for the standard model and R = 0.30 for the zoomed model. This trend can also be seen in the other three data sets from the region. In both northern and southern Southeast Asia, while the two model versions agree closely with each other, both show a large deviation from the observed data (Lin et al., 2018). Thus, while greenhouse gas emissions may be used for predicting population distribution related to transportation and industry, the spread of the predicted carbon dioxide concentrations in parts per million is too broad.
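For reference, the Pearson correlation used to judge agreement between modelled and observed values can be computed as in the Python sketch below. The three short series are invented for illustration and are not data from Lin et al.; they only show how a well-matched and a poorly matched model output differ in R.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical seasonal CO2 anomalies in ppm (not taken from any study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
modelled_close = [1.1, 2.1, 2.9, 4.2, 4.8]
modelled_poor = [3.0, 1.0, 4.0, 2.0, 3.5]

r_close = pearson_r(observed, modelled_close)  # close to 1
r_poor = pearson_r(observed, modelled_poor)    # weak, in the range reported above
```

An R near 0.3 means the model explains under a tenth of the variance in the observations (R squared below 0.1), which is why the seasonal CO2 predictions are treated as unreliable here.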

The LMDz-INCA model also has no provision for detailed projections of CO2 emissions onto regional maps. We will therefore focus primarily on the nighttime lights and population-per-pixel models detailed by Gaughan et al. in order to obtain greater detail on local CO2 emissions. However, the algorithms for these methods will take time to develop. Alongside the data collection phase of the project, an initial year 1 budget of $60,000 is therefore necessary. Travel will cost $20,000. Data will be obtained from both local government institutions and non-profit organizations, with $5,000 allotted in case we need to collect more data on our own. The budget also covers the development of both algorithms and two computers: one worth $5,000-6,000, connected to the larger cloud network to facilitate easier access to and storage of data, and one worth $2,000-3,000 as a personal workstation for algorithm development and other data analysis work. Another $3,000 has been allotted for documentation of the work.
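The itemized amounts can be tallied against the year 1 total as follows. This Python sketch uses only the figures stated above; the item labels are my own, and the text does not itemize how the remainder is spent.

```python
TOTAL_BUDGET = 60_000

# (low estimate, high estimate) in US dollars for each itemized cost.
items = {
    "travel": (20_000, 20_000),
    "additional data collection": (5_000, 5_000),
    "cloud-connected computer": (5_000, 6_000),
    "personal workstation": (2_000, 3_000),
    "documentation": (3_000, 3_000),
}

itemized_low = sum(low for low, _ in items.values())
itemized_high = sum(high for _, high in items.values())

# Amount left for costs that are not itemized, such as algorithm development.
remaining_low = TOTAL_BUDGET - itemized_high
remaining_high = TOTAL_BUDGET - itemized_low
```

The itemized costs come to $35,000-37,000, leaving $23,000-25,000 of the $60,000 for everything not listed individually.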

I also propose using the Zipf-to-Pareto constant correlation function to understand the relationship between the current number of cities and overall CO2 emissions output. The inverse relationship between the Pareto principle and Zipf’s law will further develop the relationship between carbon dioxide output and the number and size of cities, allowing national CO2 output to be estimated from a single city’s population. Our proposed result, alongside our model, should be a sufficient basis for future CO2 distribution prediction modelling.
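A minimal Python sketch of this idea follows. It assumes that city populations follow a rank-size rule (population proportional to rank raised to the negative Zipf exponent), that the Pareto exponent is the reciprocal of the Zipf exponent, and that per-capita emissions are the same in every city. The function names, the exponent of 1, and the example city are all hypothetical, and the result covers only the cities counted rather than a full national inventory.

```python
def zipf_population(rank, largest_pop, zipf_exponent=1.0):
    """Population of the city at a given rank under a rank-size rule."""
    return largest_pop / rank ** zipf_exponent


def pareto_exponent(zipf_exponent):
    """Pareto tail exponent implied by a Zipf rank-size exponent."""
    return 1.0 / zipf_exponent


def urban_emissions_estimate(known_pop, known_rank, known_emissions,
                             n_cities, zipf_exponent=1.0):
    """Scale one city's footprint to the n_cities largest cities."""
    per_capita = known_emissions / known_pop
    # Work back from the known city to the implied largest city.
    largest_pop = known_pop * known_rank ** zipf_exponent
    total_pop = sum(
        zipf_population(rank, largest_pop, zipf_exponent)
        for rank in range(1, n_cities + 1)
    )
    return per_capita * total_pop


# Hypothetical input: the second-largest city has 4 million people and
# emits 8 million tons of CO2 per year; the country has 4 major cities.
estimate = urban_emissions_estimate(4_000_000, 2, 8_000_000.0, n_cities=4)
```

In practice the exponent would be fitted to observed city sizes rather than fixed at 1, and the constant per-capita assumption would be replaced by the per-city emissions model.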

Possible Difficulties and Solutions

However, I may encounter some difficulty with existing machine-learning algorithms for this purpose, as they may contain bugs that prevent fully accurate data collection. Similarly, some of the initial data may not currently exist, so that prediction of present-day CO2 emissions may have to be based on data accumulated over multiple earlier years and ending several years before the present.

In this case, Markov chain and recurrent neural network techniques can be used to fill in the gaps left by missing data. However, accuracy may be lost when testing and predicting future data. The Python programming language will be used for data processing and visualization.
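As a rough illustration of Markov-chain gap filling, the Python sketch below treats a yearly emissions series as a sequence of discrete levels, counts how often each level follows another, and fills each missing year with the most frequent successor of the preceding year. The three-level coding, the sample series, and the function names are my own assumptions; a recurrent neural network would replace the transition table with a learned model and is not shown.

```python
from collections import Counter, defaultdict


def fit_transitions(states):
    """Count first-order transitions between consecutive observed states."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(states, states[1:]):
        if prev is not None and nxt is not None:
            counts[prev][nxt] += 1
    return counts


def fill_gaps(states, counts):
    """Replace each None with the most frequent successor of the prior state."""
    filled = list(states)
    for i, state in enumerate(filled):
        if state is None and i > 0 and filled[i - 1] in counts:
            filled[i] = counts[filled[i - 1]].most_common(1)[0][0]
    return filled


# Hypothetical yearly emission levels; None marks a missing year.
series = ["low", "low", "mid", "mid", "high", None, "high",
          "high", "mid", None, "mid", "high", "high"]

transitions = fit_transitions(series)
completed = fill_gaps(series, transitions)
```

A gap at the very start of the series is left unfilled, since there is no preceding state to condition on.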

We will have multiple separate datasets for training and testing. The training set may consist of more general past CO2 emissions data and won’t strictly be from the region. The testing data will be more recent CO2 emissions data from the region we are currently studying, in this case Southeast Asia. The sizes of these sets are flexible, and we will investigate how the accuracy of the output depends on the training and testing input sets.
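One way to express this split in Python is sketched below. The cutoff year, the record layout, and the emission values are placeholders, and the study region is assumed here to be the three countries covered by Gaughan et al.

```python
# Assumed study region: the countries covered by Gaughan et al.
STUDY_REGION = {"Vietnam", "Cambodia", "Laos"}


def split_records(records, cutoff_year, study_region):
    """Train on older data from anywhere; test on recent in-region data."""
    train = [r for r in records if r["year"] < cutoff_year]
    test = [
        r for r in records
        if r["year"] >= cutoff_year and r["country"] in study_region
    ]
    return train, test


# Placeholder records; the co2_mt values are not real measurements.
records = [
    {"country": "Vietnam", "year": 2005, "co2_mt": 1.0},
    {"country": "Germany", "year": 2008, "co2_mt": 2.0},
    {"country": "Laos", "year": 2016, "co2_mt": 3.0},
    {"country": "Germany", "year": 2017, "co2_mt": 4.0},
    {"country": "Cambodia", "year": 2018, "co2_mt": 5.0},
]

train, test = split_records(records, cutoff_year=2015, study_region=STUDY_REGION)
```

Recent records from outside the region fall into neither set under this rule. Splitting by year also keeps the test period strictly after the training period, so the model is never evaluated on years it has already seen.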

Similar methods may instead replace the currently chosen model of CO2 projection. Those methods, too, may include changes to the linear regression function within the model to better fit the training and testing sets of data. Nonlinear models, primarily polynomial models, will also be considered for the data and will be continually compared with the original linear regression method in order to understand the general increase in CO2 emissions.
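The comparison between linear and polynomial regression could be run along the lines of the Python sketch below, which fits both by ordinary least squares and compares their error on held-out points. The series is synthetic (generated from a known quadratic) and all names are my own; in the actual study a library routine would likely replace the hand-written solver.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations: a[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def rmse(coeffs, xs, ys):
    squared = [(predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(xs)) ** 0.5


# Synthetic series from y = 2 + 0.5x + 0.3x^2; x is years since the start.
train_x, train_y = [0, 1, 2, 3, 4, 5], [2.0, 2.8, 4.2, 6.2, 8.8, 12.0]
test_x, test_y = [6, 7], [15.8, 20.2]

linear = polyfit(train_x, train_y, degree=1)
quadratic = polyfit(train_x, train_y, degree=2)
linear_error = rmse(linear, test_x, test_y)
quadratic_error = rmse(quadratic, test_x, test_y)
```

On accelerating data like this the quadratic fit has the lower held-out error, but on noisy real data a higher-degree polynomial can overfit, which is why the comparison is made on the testing set rather than the training set.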

References

Blancas, L. C., & El-Hifnawi, M. B. (2014). Facilitating Trade through Competitive, Low-Carbon Transport: The Case for Vietnam’s Inland and Coastal Waterways. Directions in Development: Countries and Regions.

Gaughan, A. E., Oda, T., Sorichetta, A., Stevens, F. R., Bondarenko, M., Bun, R., . . . Nghiem, S. V. (2019). Evaluating nighttime lights and population distribution as proxies for mapping anthropogenic CO2 emission in Vietnam, Cambodia and Laos. Environmental Research Communications, 1(9).

Lin, X., Ciais, P., Bousquet, P., Ramonet, M., Yin, Y., Balkanski, Y., . . . Zhou, L. (2018). Simulating CH4 and CO2 over South and East Asia using the zoomed chemistry transport model LMDz-INCA. Atmospheric Chemistry and Physics, 18(13), 9475-9497.

Meng, J., Mi, Z., Guan, D., Li, J., Tao, S., Li, Y., . . . Davis, S. J. (2018). The rise of South-South trade and its effect on global CO2 emissions. Nature Communications, 9(1), 1-7.