Feed SQ | The Economy

Artificial Intelligence: The New Cold War and the Role of Universities

Picture

Member for

8 months 1 week

Real name

Stefan Schneider

Bio

Stefan Schneider brings a dynamic energy to The Economy’s tech desk. With a background in data science, he covers AI, blockchain, and emerging technologies with a skeptical yet open mind. His investigative pieces expose the reality behind tech hype, making him a must-read for business leaders navigating the digital landscape.

Input

February 11, 2025

Changed

March 18, 2025

The launch of Sputnik 1 by the Soviet Union in 1957 sent shivers across the United States, calling into question its claimed technological superiority and sparking the Space Race. Jump ahead to January 2025, and a comparable paradigm shift has taken place in the field of artificial intelligence (AI). DeepSeek, a Chinese artificial intelligence business located in Hangzhou, Zhejiang, has introduced its revolutionary AI model, DeepSeek-R1. This model not only competes with but also exceeds the capabilities of the most advanced American AI technologies in several areas. DeepSeek's accomplishment has sent shockwaves through financial markets throughout the world, raising serious issues about who will lead the future of artificial intelligence. This success was achieved for a fraction of the cost that U.S. tech giants such as OpenAI, Microsoft, and Meta have invested.

The Nasdaq Composite Index, which reflects the performance of the top technology companies in the United States, fell by around 3.1% within 48 hours of the unveiling of DeepSeek-R1. Nvidia, a prominent player in AI hardware, had its stock collapse by roughly 17%, equating into a catastrophic loss of $589 billion in market value. The consequences went beyond the financial statements of companies; legislators, industry executives, and academics were compelled to reconsider the United States' approach for artificial intelligence.

The growing technical competition between the United States and China has resulted in increased scrutiny of applications produced in China, with TikTok being a central point of focus in this "AI Cold War." U.S. lawmakers are thinking about taking strict action against the popular social media site because of worries about data privacy, national security, and the possibility of foreign influence.

Source: *https://easy-peasy.ai/ai-image-generator/images/cold-war-geopolitical-symbolism-strategic-divide-visualized*

Officials in the United States are worried that ByteDance, the firm that owns TikTok, would be forced to give the Chinese government user data because of Chinese law. This has raised concerns that TikTok could be used for spying or spreading propaganda, which would endanger national security. As a result, legislation has been passed that requires ByteDance to sell TikTok within a certain period of time or risk being banned in the United States.

The possible prohibition of TikTok in the United States brings up important considerations about the future of an internet that is open and linked. The United States has a long history of supporting internet freedom and the free exchange of information. On the other hand, the current attitude toward TikTok indicates a change in perspective that involves looking at digital platforms from a geopolitical point of view, which could result in further digital fragmentation.

Similar to the Cold War era's war for space supremacy, the current race for artificial intelligence is not just a competition between the United States and China. Rather, it has developed into a multilateral competition in which other countries contribute their own distinct advantages to the arena. By 2030, China wants to be the leader in the artificial intelligence business. The country has a lot of state-backed financing, large data ecosystems, and powerful AI companies such as DeepSeek, Baidu, Tencent, and Alibaba. Government policies encourage innovation powered by artificial intelligence, combining the ambition of the state with the energy of entrepreneurs. Europe places a high priority on ethical artificial intelligence (AI), data privacy, and human-centered design. It is leading projects such as the Confederation of Laboratories for AI Research in Europe (CLAIRE) and the European Union's AI Act, which define global standards for ethical AI governance. Emerging economies, such as India, Brazil, South Africa, and Southeast Asia, use artificial intelligence to tackle urgent social issues in healthcare, agriculture, and financial inclusion. Programs such as "AI for All" in India highlight the potential of artificial intelligence to promote fair growth. The United States is experiencing increasing competition, even though it has a long history of being a leader in artificial intelligence. Although there is a solid foundation for progress due to world-class universities, innovation in the private sector, and government-supported research efforts, bureaucratic inefficiency and fragmented regulations are obstacles to quick improvement.

The changing environment of artificial intelligence demonstrates the democratization of innovation, where leadership is no longer determined exclusively by economic might but rather by agility, collaboration across disciplines, and strategic vision. American institutions were essential to the Space Race because they sped up research, increased STEM education, and encouraged innovation. In the same way, higher education institutions must adjust to new difficulties in an AI Cold War in order to make sure that the United States stays at the forefront of technological advancement.

It is possible that traditional university systems, which have isolated disciplines and lengthy research cycles, are no longer adequate. The current AI revolution requires an interdisciplinary strategy that combines topics like as cognitive science, ethics, policy, and engineering. Universities should focus on interdisciplinary AI programs that connect machine learning, law, philosophy, and public policy; speed up innovation cycles and move away from traditional academic bureaucracy to support real-world applications; and create collaborative AI ecosystems that encourage partnerships between academia, industry, and government agencies.

Note: Looming US and China Techwar. / Source: TODAY/Raymond Limantara

Artificial intelligence research is most successful in an environment that is open and collaborative. Even if geopolitical issues may restrict direct relationships between the United States and China, universities should actively pursue alliances with institutions in Europe, India, and Latin America in order to develop a variety of perspectives on artificial intelligence and networks for exchanging knowledge.

As AI systems become more powerful, ethical problems concerning prejudice, data privacy, and misinformation grow. Universities should take the initiative in developing ethical AI frameworks by incorporating AI ethics into their core curricula, creating research initiatives that focus on fairness, transparency, and accountability in AI, and involving policymakers in the creation of responsible AI regulations that strike a balance between innovation and security.

The Chinese government is supporting educational reforms that are quickly creating a highly competent workforce in artificial intelligence. The United States has to respond to this by increasing financing for artificial intelligence education at all levels, extending outreach programs in science, technology, engineering, and mathematics (STEM) to diversify the talent pool for artificial intelligence, and making immigration regulations more favorable to recruit the best artificial intelligence talent from across the world to American universities.

Note: Statistical graph visualizing the US-China AI Cold War

The artificial intelligence The Cold War is not just about who has the most advanced technology; it also signifies a major change in the way power is distributed around the world. The countries that are at the forefront of artificial intelligence will determine the future of industries, military policy, and economic growth. In order for the United States to maintain its position of leadership, universities must adopt agility, interdisciplinary collaboration, and international cooperation. Institutions of higher education cannot afford to operate in isolation; they must actively contribute to creating the future of artificial intelligence through innovative research, ethical issues, and workforce development.

America's reaction to Sputnik in the 20th century set the stage for its leadership in science and technology for many years to come. In the 21st century, the answer to DeepSeek and the larger AI Cold War must be as transformational. The competition is on, and the importance of universities has never been greater.

Picture

Member for

8 months 1 week

Real name

Stefan Schneider

Bio

EU Regulations, Obstacles, and Prospects for Organic Agriculture

Picture

Member for

8 months 1 week

Real name

Nathan O’Leary

Bio

Nathan O’Leary is the backbone of The Economy’s editorial team, bringing a wealth of experience in financial and business journalism. A former Wall Street analyst turned investigative reporter, Nathan has a knack for breaking down complex economic trends into compelling narratives. With his meticulous eye for detail and relentless pursuit of accuracy, he ensures the publication maintains its credibility in an era of misinformation.

Input

February 11, 2025

Changed

March 18, 2025

Thanks to the European Union's (EU) strong support, organic farming is expanding quickly throughout Europe. Regulations governing the operations of organic farms are largely defined by the EU. EU regulations affect every aspect of the organic industry, from determining what can be sold as organic to determining which farming methods are permitted. Are these policies effective, though? And what obstacles still exist?

One of the most significant pieces of legislation pertaining to organic farming is the EU Organic Regulation (EU) 2018/848. It establishes the substances that can be used in organic food, which goods can be branded as organic, and how farmers can obtain the EU organic badge. The planned New Genomic Techniques (NGTs) law is another significant policy reform. Genetic modification regulations, food labeling, and organic farmers' capacity to select non-GMO crops may all be impacted by this. The organic business environment is still being shaped by additional governmental ideas, such as stronger green claims regulations and new sustainable labeling guidelines.

Source: https://kids.earth.org/climate-change/organic-farming/

As part of the Green Deal's Farm to Fork initiative, the EU aims to have at least 25% of its agriculture be organic by 2030. Only 10.5% of EU agriculture was organic in 2022, up from 5.9% in 2012. The growth rate of organic farming must accelerate to meet the 25% target. Organic farming is currently expanding at a rate of roughly 6% annually. This must rise to about 10% annually. Thus, until 2030, an additional 2.9 million hectares must be converted to organic farming each year. Through the Common Agricultural Policy (CAP) for 2023–2027 and the EU Action Plan for Organic Production, the EU is attempting to promote organic farming. Farmers receive financial assistance from CAP, and national plans want to raise the percentage of organic farming to 10% by 2027. Reports, however, indicate that these initiatives might not be sufficient to meet the 25% goal by 2030.

In 2024, the European Court of Auditors (ECA) published a study evaluating the effectiveness of EU funding for organic farming. Since 2014, the EU has invested over €12 billion to promote organic farming. But according to the research, organic production has not increased much as a result of the funding. The ECA report's main conclusions emphasize the absence of adequate tracking and monitoring methods, ineffective cash allocation, and the sluggish expansion of organic farming. These results imply that although the EU is investing much on organic farming, the results are not as significant as anticipated. Improvements must be made if the EU is to meet its organic farming targets. To successfully resolve issues, experts advise better funding distribution, enhanced monitoring systems, and increased interaction with farmers and others.

Although organic farming has numerous advantages, there are drawbacks as well. Because organic farming does not use synthetic fertilizers and pesticides, the soil and water are cleaner, which reduces pollution. Additionally, because organic farms don't use dangerous pesticides, pollinators like bees and butterflies do better there. Additionally, by encouraging a more diverse range of plants, animals, and insects, organic farming contributes to biodiversity. Over time, it increases soil fertility, lowers the chance of soil erosion and degradation, and eventually makes land more sustainable. By encouraging natural soil enrichment processes, carbon absorption, and lower energy consumption than intense conventional farming methods, organic farming also lowers greenhouse gas emissions.

Organic farming does, however, come with challenges. Organic food might be more costly than conventional food due to lower yields. Manual operations like weeding and pest control demand more manpower, which raises the cost of manufacturing. Since organic farmers do not utilize genetically modified organisms (GMOs), their crops are more susceptible to pests and climate change, which presents another challenge: limited crop variety. In organic farming, seasonal constraints can affect the year-round supply of food. To successfully switch to organic practices, farmers require more technical training and knowledge, which may make it difficult for new organic farmers to get started.

Alternative farming methods like hydroponics and vertical farming are growing in popularity due to the drawbacks of organic farming. With hydroponics, plants are grown in nutrient-rich water rather than soil. This method enables farming in urban environments, saves water, and boosts yields. Another creative approach is vertical farming, which uses little area to grow crops in stacked layers. It minimizes land consumption and is perfect for urban farming. These techniques can enhance productivity and increase the sustainability of food production, which can support organic farming. Another sustainable practice is agroforestry, which involves growing crops and trees together to increase soil fertility, biodiversity, and carbon sequestration while giving farmers a variety of revenue streams.

Organic farming has benefited from EU funding, but there are still issues. The EU's organic agricultural programs can be strengthened with better finance plans, enhanced oversight, and increased farmer involvement. The industry can be strengthened by increasing consumer awareness of the advantages of eating organic food, developing organic market infrastructure, and growing financial assistance programs for small-scale farmers. Growing cooperation among EU policymakers, organic farming associations, and academic research institutes will foster the development of creative solutions to raise the sustainability and efficiency of organic farming. A more sustainable food system can be achieved in large part through organic farming, but in order to satisfy the increasing demand for sustainable food production, organic farming must develop alongside cutting-edge farming methods. Europe will advance toward a more resilient and ecologically friendly agricultural future by combining traditional organic farming with contemporary sustainable technologies.

Note: Organic Farmland in Europe / Source: *Landgeist*. https://landgeist.com/2022/09/03/organic-farming/

The market for organic foods is significantly shaped by consumer behavior in addition to EU regulations. Organic products are becoming more and more popular as people become more conscious of environmental and health issues. Price is still a major obstacle, though. Because organic food is frequently more expensive than its conventionally cultivated counterparts, people with lesser incomes may find it more difficult to obtain. Businesses and governments can collaborate to lower the cost of organic food by using direct farm-to-consumer methods, price incentives, and subsidies. To encourage more farmers to embrace sustainable practices and boost the total availability of organic food, another tactic is to raise the knowledge surrounding organic farming.

Furthermore, organic producers are facing new difficulties because of climate change. Crop yields are under threat from rising temperatures, harsh weather, and erratic growth conditions. Organic farmers must use climate-smart practices, such as drought-tolerant crops, better irrigation systems, and soil conservation measures, to reduce these risks. For organic farming to remain sustainable over the long run, research and development will be essential.

Organic farming may continue to grow in the EU and around the world with the correct mix of market incentives, consumer education, regulatory backing, and technology breakthroughs. The organic industry has the power to revolutionize European agriculture and build a more robust, ecologically responsible, and health-promoting food system. Europe can take the lead in establishing organic farming as a popular and sustainable agricultural model for the future by tackling current issues and seizing new opportunities.

Picture

Member for

8 months 1 week

Real name

Nathan O’Leary

Bio

Trump says he will announce raft of new trade tariffs

Picture

Member for

8 months 1 week

Real name

Joshua Gallagher

Bio

A seasoned journalist with over four decades of experience, Joshua Gallagher has seen the media industry evolve from print to digital firsthand. As Chief Editor of The Economy, he ensures every story meets the highest journalistic standards. Known for his sharp editorial instincts and no-nonsense approach, he has covered everything from economic recessions to corporate scandals. His deep-rooted commitment to investigative journalism continues to shape the next generation of reporters.

Input

February 11, 2025

Changed

March 18, 2025

President Trump has announced a major policy shift, unveiling plans to impose 25% tariffs on all steel and aluminum imports into the U.S. This series of trade measures would significantly affect global markets, particularly targeting steel and aluminum imports. He revealed plans to impose a 25% tariff on all steel and aluminum entering the U.S., with the most significant impact on its largest trading partners, including Canada and Mexico. Trump also signaled future actions involving reciprocal tariffs, meaning countries that impose tariffs on U.S. products could face U.S. tariffs matching their own rates. Although he did not specify which countries would be affected, the European Union and other major trading partners, like Australia, voiced concerns.This decision aligns with his broader trade agenda, which aims to prioritize U.S. manufacturing, strengthen the economy, and address what he sees as unfair trade practices. Speaking from Air Force One, Trump emphasized that these tariffs would target all nations, signaling the U.S. will retaliate against countries that impose tariffs on American goods. This is part of Trump's "reciprocal tariffs" policy, where the U.S. will charge back any country imposing duties on U.S. products.

Note: Photo of President Trump depicting policy decisions. / Source: https://www.newstatesman.com/politics/2017/01/how-much-power-does-donald-trump-really-have-trade

Speaking aboard Air Force One en route to New Orleans for the 2025 Super Bowl, he also revealed plans to introduce "reciprocal tariffs" on Tuesday or Wednesday, which will take effect immediately. This means the U.S. will impose tariffs on goods from countries that have imposed duties on U.S. products. Trump stated, "If they charge us, we charge them … every country," and emphasized that the U.S. would no longer accept unfair trade practices, such as paying 130% in tariffs while other nations face no tariffs.

The European Commission stated that it had not received formal notification of the tariffs, but French foreign minister Jean-Noel Barrot vowed that the EU would respond in kind, saying there would be no hesitation in defending European interests. The European Union has indicated it will respond strongly if these tariffs are enacted, with French foreign minister Jean-Noel Barrot expressing that the EU will "replicate" any new tariffs. Germany, too, warned of the negative consequences of prolonged tariff conflicts, with Robert Habeck asserting that they would unite in defending European interests. The EU has long been critical of Trump's trade policies, and its response suggests a continued commitment to challenging tariffs they deem unjust. Germany echoed this stance, with Robert Habeck asserting that Europe would unite in its response, noting that prolonged tariff conflicts ultimately result in losses for all parties involved. During his first term, Trump imposed 25% tariffs on steel and 10% on aluminum, later granting several trading partners exemptions. Earlier this month, he announced new tariffs targeting the U.S.'s closest trading partners—Mexico, Canada, and China. Last week, he agreed to delay the 25% tariffs on imports from Mexico and Canada for 30 days, along with additional tariffs on Canadian oil, natural gas, and electricity. Trump, a strong advocate of tariffs, views them as crucial for gaining cooperation from countries to address illegal immigration and fentanyl smuggling. Additionally, he believes tariffs will help boost U.S. manufacturing and generate revenue for the federal government.

Note: EU and US demonstrating its economic ties. / Source: https://worldview.stratfor.com/article/can-eu-avoid-trumps-tariffs-importing-more-us-oil-and-gas

The European Union, in particular, argued that there was no justification for the proposed tariffs and promised to protect European businesses and consumers. Canada, a major supplier of steel and aluminum to the U.S., was expected to seek an exemption, as it had during Trump’s first term. The tariffs, which are part of Trump's broader economic strategy to boost U.S. industry and jobs, have already caused market volatility, with shares of steel and automotive companies, particularly from South Korea, falling sharply.

As part of his trade policy, Trump has threatened additional duties on imports from China, leading to retaliatory tariffs by Beijing on U.S. goods such as coal, oil, and agricultural machinery. These tensions are further complicated by the U.S.'s ongoing trade dispute with the European Union over auto tariffs, where Trump has previously suggested that the UK could avoid new levies. His broader plan to implement reciprocal tariffs is designed to ensure fairness, he argued, by levying equivalent tariffs on countries that charge the U.S. higher import duties.

The new tariffs build on Trump's first term when he implemented similar measures targeting steel and aluminum imports, although some trading partners were granted exemptions later on. More recently, Trump has expanded his tariff strategy to include imports from Mexico, Canada, and China. He has previously delayed some tariffs on Canadian and Mexican goods, citing security concerns and negotiations around border security, such as preventing illegal immigration and drug smuggling.

In response to Trump's tariffs, China imposed its own retaliatory measures, including new tariffs and an antitrust investigation into Google. China has been careful in its retaliatory actions, opting for measured responses that target specific U.S. industries while avoiding drastic actions that could harm its own economy. Meanwhile, global markets have been reacting to the escalating trade tensions, with commodities like gold hitting record highs.

Furthermore, Trump’s remarks about the Gulf of Mexico—renaming it “Gulf of America”—added to the international controversy. Critics, including the Mexican government, argue that the U.S. has no right to alter the name under international law. Additionally, Trump’s comments about potentially acquiring Canadian and Palestinian territories sparked further diplomatic fallout, illustrating his tendency to blur the lines between policy, rhetoric, and provocation.

The U.S. tariffs have been criticized for raising costs for U.S. industries that rely on imported steel and aluminum, like automotive manufacturers and beverage producers, which are expected to pass these costs onto consumers. Despite this, Trump remains firm in his belief that these tariffs are essential for the U.S. to protect its national security interests and economic competitiveness.

Note: Trump's proposed tariffs on steel and aluminum imports for key trading partners.

Overall, Trump’s trade measures are part of a larger economic agenda aimed at reshaping global trade relationships, but they have provoked significant backlash from key allies, while also contributing to global market instability and fears of an escalating trade war.

Tariffs are central to Trump's economic vision, as he believes they will foster domestic industry growth, create jobs, and generate government revenue. Trump has regularly stated that these tariffs are necessary tools to secure better trade deals and strengthen U.S. interests on the global stage. However, critics argue that these tariffs could lead to trade wars, harming industries that rely on international supply chains, as well as U.S. consumers who may face higher prices.

This move also comes amid growing tensions with major trade partners, particularly China, which has already imposed its own tariffs on U.S. goods, contributing to an ongoing trade war. As Trump's tariffs escalate, the situation has prompted concerns about global economic stability, with the potential for further retaliatory actions from affected countries. The U.S. government is navigating a complex global trade environment where these tariffs could either help reinforce its goals or lead to significant economic backlash.

Picture

Member for

8 months 1 week

Real name

Joshua Gallagher

Bio

The Economy

J.P. Morgan: $2 Trillion Stablecoin Growth Forecast Is Overblown July 25, 2025

July 25, 2025

Ahead of FOMC, Trump Visits Fed and Publicly Pressures Powell to Cut Rates

July 25, 2025

“Koreans Probably Let Out a Few Expletives”—Commerce Secretary's Remarks Reveal Trade Strategy

July 25, 2025

The EU-China Summit: Posturing, Pragmatism, and an Unresolved Future

July 25, 2025

The Silicon Valley–China AI Culture Clash: Work, Innovation, and Ideological Tradeoffs

Intel in Crisis: Layoffs, Downsizing, and Delays as Chip Giant Struggles to Reboot July 25, 2025

“A Bad Deal for America”: Detroit Three Slam U.S.–Japan Trade Agreement

July 24, 2025

Ripple Teams Up with IMF in Bid to Broaden Global Reach

July 24, 2025

"Scrap the Taxes and Rules": Is the U.S. Using Tariff Wars to Shield Its Big Tech Giants?

July 24, 2025

"No Longer Special": Mounting Concerns Over OpenAI’s Waning Edge

July 24, 2025

EU Moves Toward 15% Tariff Deal to Avert U.S. Tensions — An Uneasy Compromise Returns Amid Trump’s Pressure

Main Banner 01_a

Main Banner 01_b

Main Banner 01_c

Main Banner 01_d

America Retreats Again: U.S. Withdraws from UNESCO Over Ideological Rift

July 24, 2025

Japanese Carmakers Gain U.S. Market Share After Holding the Line on Prices

July 23, 2025

Wall Street Economist El-Erian: "Powell Should Resign to Protect Fed's Independence"

July 23, 2025

AstraZeneca’s $50 Billion U.S. Investment: Capitulating to Trump’s Brand of Tariff Pressure July 23, 2025

U.S. House Passes ‘CBDC Ban Act’; Moves to Strengthen Dollar Hegemony Through Stablecoins

July 22, 2025

Hanwha Ocean Secures U.S. LNG Carrier Order, Setting the Stage for Expansion into the MRO Market Ecosystem

July 22, 2025

China Reshapes Global Shipping Order, Secures Stakes in 129 Ports Worldwide

July 22, 2025

Main Banner 02_a

Main Banner 02_b

Main Banner 02_d

Main Banner 02_c

All News

Financial

Ahead of FOMC, Trump Visits Fed and Publicly Pressures Powell to Cut Rates

July 25, 2025

Wall Street Economist El-Erian: "Powell Should Resign to Protect Fed's Independence"

July 23, 2025

Stablecoins Enter Regulatory Framework, U.S. Financial Market Power Struggle Intensifies

July 21, 2025

Tech

The Silicon Valley–China AI Culture Clash: Work, Innovation, and Ideological Tradeoffs

July 25, 2025

Intel in Crisis: Layoffs, Downsizing, and Delays as Chip Giant Struggles to Reboot

July 25, 2025

“A Bad Deal for America”: Detroit Three Slam U.S.–Japan Trade Agreement

July 24, 2025

Policy

“Koreans Probably Let Out a Few Expletives”—Commerce Secretary's Remarks Reveal Trade Strategy

July 25, 2025

The EU-China Summit: Posturing, Pragmatism, and an Unresolved Future

July 25, 2025

EU Moves Toward 15% Tariff Deal to Avert U.S. Tensions — An Uneasy Compromise Returns Amid Trump’s Pressure

July 24, 2025

Bio&Science

AstraZeneca’s $50 Billion U.S. Investment: Capitulating to Trump’s Brand of Tariff Pressure

July 23, 2025

RFK Jr., Gavi, and the Price of Principle: Global Health in Jeopardy

July 22, 2025

Tibet, which has emerged as a power supply site for China's AI computing center, installs the world's largest hydroelectric turbine

July 4, 2025

Bond Yields Rising, Options Shrinking: Japan’s Fragile Balancing Act

July 21, 2025

"Rapid Progress in Banks’ Entry into Crypto Business" – U.S. Regulators Issue Custody Guidelines for Banks

July 15, 2025

Shaken Dollar Status: ‘Tariff, Tax Cut Bill, Fed Pressure’—Trump Risk Erodes Safe-Haven Appeal

July 11, 2025

"Scrap the Taxes and Rules": Is the U.S. Using Tariff Wars to Shield Its Big Tech Giants?

July 24, 2025

"No Longer Special": Mounting Concerns Over OpenAI’s Waning Edge

July 24, 2025

Japanese Carmakers Gain U.S. Market Share After Holding the Line on Prices

July 23, 2025

America Retreats Again: U.S. Withdraws from UNESCO Over Ideological Rift

July 24, 2025

Hanwha Ocean Secures U.S. LNG Carrier Order, Setting the Stage for Expansion into the MRO Market Ecosystem

July 22, 2025

China Reshapes Global Shipping Order, Secures Stakes in 129 Ports Worldwide

July 22, 2025

Oil behemoth Shell reportedly approached BP about a takeover, fueling BP sale rumors.

June 26, 2025

China Significantly Cuts U.S. Agricultural Imports, Bringing 'Windfall Gains' to South American Farmers

June 25, 2025

China Rapidly Catching Up to the U.S. in the Bio Sector

June 20, 2025

Main Banner 03_a

Main Banner 03_b

All Economy Books

Forecasting Demand for Public Bicycling in Seoul to Optimize Bike Repositioning at Rental Stations

Published

November 11, 2024

Sungsu Han*

* Swiss Institute of Artificial Intelligence, Chaltenbodenstrasse 26, 8834 Schindellegi, Schwyz, Switzerland

Abstract

This study aims to improve operational efficiency by optimizing the relocation of Seoul's public bicycles, “ Seoul Bike.” The highlights of the study include Analyzing the usage patterns of Seoul Bike: We analyzed the bicycle travel patterns between business and residential areas, focusing on rush hour.

Introduced the concept of spatiotemporal equilibrium: We proposed the concept of “equilibrium,” where borrowing and returning are balanced in a daily cycle, to provide a basis for relocation strategies.

Developed a data-driven prediction model: We used a SARIMAX model to predict usage by rental location, taking into account variables such as weather, time of day, and location. Optimized relocation strategy: Introduced the D-Index to identify borrowing stations that needed to be relocated, and clustering using the Louvain algorithm to establish efficient relocation zones.

Visualization tool development: We created a visualization tool to intuitively understand bicycle movement patterns and relocation strategies by time of day. This study suggests ways to increase the efficiency of public bicycle operations and improve user satisfaction through a data-driven scientific approach. The results of this study are expected to help Seoul Bike, Seoul reduce operating costs and improve service quality.

This paper will be organized as follows.
For the sake of readability, I'm going to divide this article which may seem long and complicated with multiple stories into 5 section.

In the first section, I'll talk about why I became interested in Seoul Bike and the question “Where are the best places to ride Seoul Bike in Seoul?”. In section 2, I would like to talk about the environmental benefits of public bicycle rental in Seoul and what are the biggest problems with this hot item that can solve the health problems of Seoul office workers.

Through this, I will talk about what causes the public bicycle business to always be in the red and analyze the operating costs of the public bicycle industry to lay the groundwork for my thesis topic. In the third section, I will analyze the usage patterns of Seoul Bike from the perspective of users of Seoul Bike in Seoul to lay the groundwork for the core concept of this thesis, equilibrium, which will be discussed in section 4.

In the fourth section, I will discuss specific ideas on how to forecast the usage of each Seoul Bike rental center to solve the deficit of Seoul Bike in Seoul. Finally, we will talk about additional ideas that can maximize the utility of the previous ideas and describe the datasets that can be applied to them.

We'll also talk about the implications of Seoul Bike's work on the Han River bike path and conclude with a discussion of the implications for the future.

Keywords: Seoul Bike, User Patterns, Rental Stations

PDF View PDF Download

1. Introduction

1.1. Background

I am a researcher who commutes from Goyang City to my office near Magok Naru Station in Seoul. One day on my way to work, I rubbed my sleepy eyes as I stepped off the shuttle and was surprised to see hundreds of green bicycles.

I usually commute to work on the company shuttle, but I've recently gotten into cycling, and when the weather is nice, I even bike from home to work. The biggest reason I got into cycling is because of the positive image of Seoul Bike, the city's public bicycle program.

The question “Where did all those hundreds of bicycles at Magok intersection on my way to work come from?” [figure 1-1-1] stuck in my mind for a while, and I realized that I could combine my interest in bicycles with my thesis on public bicycle projects.

1.2. Problem Statement

Seoul Bike, the city's public bicycle program, has grown rapidly since its introduction in 2015, with great support from Seoul citizens. As of 2023, more than 43,000 bicycles were in operation at more than 2,700 rental stations, and the cumulative number of subscribers exceeded 4 million, making it one of the city's most successful policies. This has had a variety of positive effects, including reducing traffic congestion, addressing air pollution, and improving the health of citizens.

But with this rapid growth has come a number of challenges. The city of Seoul runs a deficit of around 10 billion won every year. This is due to low user fees and high operating costs due to the nature of the public service.

There is a shortage or surplus of bicycles in certain time zones and neighborhoods. This causes inconvenience to users and reduces service efficiency. As of 2019, Seoul Bike had only 60 maintenance staff to manage 25,000 bikes, resulting in an increase in broken bikes and poor management of rental stations. Irresponsible use, vandalism, and loss of bikes are causing additional management costs.

1.3. Aims and Objectives

To address these issues, this study aims to develop a demand prediction model for each Seoul Bike rental station: Develop a model that accurately predicts the demand for each bike rental station by time of day and day of the week by utilizing time series analysis and machine learning algorithms. Based on the predicted demand, we optimize the number of bikes at each rental location to solve the problem of demand-supply imbalance.

Reduce unnecessary relocation costs, increase user satisfaction, and improve overall operational efficiency through accurate demand forecasting and optimized deployment. The developed model provides data-driven decision support for critical operational decisions, such as selecting new rental station locations and deciding to purchase additional bikes.

1.4. Summary of Contributions and Achievements

This study aims to improve operational efficiency by optimizing the relocation of Seoul's public bicycles, “Seoul Bike.” The highlights of the study include analyzing the usage patterns of Seoul Bike: We analyzed the bicycle travel patterns between business and residential areas, focusing on rush hour.

Introduced the concept of spatiotemporal equilibrium: We proposed the concept of “equilibrium,” where borrowing and returning are balanced in a daily cycle, to provide a basis for relocation strategies.

Developed a data-driven prediction model: We used a SARIMAX model to predict usage by rental location, taking into account variables such as weather, time of day, and location. Optimized relocation strategy: Introduced the D-Index to identify borrowing stations that needed to be relocated, and clustering using the Louvain algorithm to establish efficient relocation zones.

Visualization tool development: We created a visualization tool to intuitively understand bicycle movement patterns and relocation strategies by time of day.

This study suggests ways to increase the efficiency of public bicycle operations and improve user satisfaction through a data-driven scientific approach. The results of this study are expected to help Seoul Bike, Seoul reduce operating costs and improve service quality.

2. Solution Approach

2.1. Best Place to Ride a Seoul Bike

According to the Seoul 2022 Transport Usage Statistics Report, Gangseo-gu is the most populated bike traffic district in Seoul, and within Gangseo-gu, seven locations near the Magok business district ranked among the top.

This is probably due to the nature of the Magok business district. Magok is a large, recently developed business complex with a high concentration of office workers. As a new development, the area is well equipped with infrastructure such as cycle paths. Its proximity to the Han River bike path makes it convenient for both commuting and leisure use.

Bicycle commuting is becoming increasingly popular due to growing environmental and health concerns. On the other hand, Goyang City's public bike "Fifteen" was closed due to losses, showing that it is not easy to ensure profitability in the public bike business. In the case of Seoul Bike, the company runs a deficit of more than 10 billion won every year, so it needs to find ways to operate sustainably.

2.2. The Cause of The Public Bike Projects' Deficit

The issue of deficits in the public bicycle business and the analysis of relocation costs is a very important topic.

Reducing the cost of relocating bicycles is likely to be the most effective way to reduce the deficit of the public bicycle project. Analyzing the usage patterns of users in Seoul Bike to optimize the relocation of bicycles will contribute to reducing the deficit.

Based on this analysis, it is important to explore efficient operational measures to increase the sustainability of the public bicycle business. Along with establishing specific strategies to reduce relocation costs, it is likely that a variety of approaches will be needed, including engaging users and diversifying revenue.

[Fig 2-2-1] Deficit status of public bicycle business
Nubiza (Changwo): KRW 4.5 billion, Tashu (Daejeon): KRW 3.6 billion, Tarangae (Gwangju): KRW 1 billion, Eoulling (Sejong): KRW 0.6 billion, Seoul Bike (Seoul): Over KRW 10.3 billion

[Fig 2-2-2] Operating cost structure of Go-yang City case
The cost of relocating bikes is the largest component, accounting for around 30 per cent of total operating costs. Relocation costs of KRW 525 million, about 30% of total maintenance costs of KRW 1.78 billion.
Relocation costs of KRW 525 million: On-site distribution - KRW 375 million, Relocation vehicle operating costs - KRW 150 million

2.3. Time of Day Usage Patterns of Seoul Bike Users

First, let's look at the usage patterns by time of day. On weekdays, the usage is concentrated at 7-8am and 5-6pm, which are commuting hours, suggesting that Seoul Bike is mainly used for commuting. However, on weekends, the usage is relatively evenly distributed from 12 noon to 6 pm, suggesting that it is used for leisure activities or going out. Next, let's look at the user characteristics. In terms of age, users are mainly in their 20s and 30s. This shows that Seoul Bike is becoming a popular means of transport for young people. In addition, the purpose of use is mainly commuting on weekdays and leisure activities on weekends.

[Fig.2-3-1] Seoul Bike Usage Rate by age group

Next, we look at the most important regional clusters. Regional clustering is characterized by a tendency for Seoul Bike to be concentrated in large business districts, areas with well-developed infrastructure such as bike paths, and areas with easy access to the Han River. In particular, it is more popular in areas with easy access to subway stations and workplaces. Representative areas include Magok District in Gangseo-gu, Songpa-gu (Lotte World Tower in Jamsil), Yeongdeungpo-gu (Yeouido business district), and Seongdong-gu .

2.4. Summary

Cycling has become a popular mode of transport for commuting and leisure activities, especially among young people, and tends to be concentrated in large business districts and areas with well-developed cycling infrastructure. These findings suggest that cycling is more than just a mode of transport, but is influencing urban transport systems and lifestyles. Taken together, these findings suggest that cycling policies need to take into account the characteristics of each neighbourhood and focus management and investment on areas with high usage. It is also necessary to consider changes in demand by time of day, with bicycle deployment needing to be adjusted to meet different demands during rush hour and on weekends.

3. Methodology

As we mentioned in Chapter 2, we were able to find out that among the newly developed business districts in Seoul, the business districts where bicycle paths are well-constructed and where access to the Han River bicycle path is easy are the ones where Seoul Bike usage is concentrated.

We will start the analysis by focusing on the super-large business districts that are the cause of the concentration of Seoul Bike usage.

So we have come up with a hypothesis. It is a concept called equilibrium. The idea is that if rentals and returns are equilibrium based on the rental station, there is no need for additional reallocation. To this end, the goal of this paper is to reduce the deficit by improving the efficiency of Seoul Bike reallocation by predicting bicycle rentals and returns.

3.1. Correlation of 24-hour weather data with rental demand

First, in order to confirm the correlation between the Seoul Bike rental data and the weather, pre-processing was performed to integrate Seoul Bike usage information and weather based on the time variable. In addition, in order to identify the phenomenon of rush hour and rush hour, the time was divided as follows.

The preprocessed data below is data from January to December 2022 at the rental office at Exit 2 of Magoknaru Station. From the leftmost column, the time variables are month/day rental time rentals and day of the week, and the weather variables are temperature, wind direction, wind speed, and accumulated precipitation. In addition, Seoul Bike usage information was added to the right column and preprocessed.

[Fig.3-1-1] Seoul Bike User Data + Weather Observation Data

Let's first look at the correlation between the preprocessed data and weather data based on the number of users per minute. First, when looking at the wind speed, the stronger the wind, the more the usage tends to decrease. The highest Seoul Bike usage was at the 3m/s level, which can be felt as a gentle breeze rather than no wind at all.

[Fig.3-1-2] Correlation with Seoul Bike User Data Weather Observation Data

Also, since the accumulated precipitation is inversely proportional to the number of users, it seems to match the general idea that people will not ride bicycles as much as they would use Seoul Bike when it rains. Lastly, the temperature was the highest when people think it is a little cool, around 15~17 degrees, but it was confirmed that the usage was low when it was too cold or too hot.

The correlation with temperature is expected to have seasonality as the Seoul Bike rental volume is correlated with the season. Therefore, we plan to conduct a time series analysis using SARIMAX using external weather variables for the remaining residuals after dividing them into seasonality and trend using STL Decomposition.

3.2. Seoul Bike Daily Return Pattern

In this part, we will analyze the return volume data of Seoul Bike rental stations for 4 years to check the seasonality of the return volume of Seoul Bike rental stations and perform STL Decomposition.

Before conducting a time series analysis to predict the demand for bicycles by Seoul Bike rental station, we drew the daily return volume of Seoul Bike behind Exit 5 of Magoknaru Station from 2019 to 2023. The reason why we did not select Exit 3 of Magoknaru Station, which has the largest number, is because it is a recently built station and there is no data for 19-20 years, so we selected the rental station behind Exit 5 of Magoknaru Station.

We drew a graph of the return and rental during commuting hours of the rental station behind Exit 5 of Magoknaru Station and the cumulative amount, which is the value of the return minus the rental. You can see that the number of bicycles increases as the years go by, and that there is seasonality with the lowest usage in winter and the highest usage in spring and fall.

[Fig.3-2-1] Correlation with Seoul Bike User Data Weather Observation Data

3.3. Modeling of Seoul bike cumulative volume during commuting hours

The number of rentals and returns is symmetrical overall, but you can see that it is biased over time. (There are more returns during commuting hours: returns > rentals)

[Fig.3-3-1] Return and Rent numbers in Go work Time

The cumulative amount representing the number of bicycles in the rental station is expressed as the difference between the number of returns and the number of rentals, and has a trend and seasonality. This means that the use of STL Decomposition, a pre-processing step for time series analysis, is necessary.

The following diagram is a graphical representation of the cumulative amount during go work Time in the business district.

[Fig.3-3-2] Return and Rent numbers in Go work Time

And when you look at the cumulative amount, you can see that the number of returns is a very large proportion compared to the number of rentals, and there is an excess of bicycles.

[Fig.3-3-3] Cumulative number of bicycles during Go work Time

This suggests that the bikes in the business area should be distributed to other areas, leaving only the minimum number of bikes needed in the early morning hours.

As expected from the graphs above, there is a significant concentration of bikes rented in the morning hours from residential districts to business districts.

This could be a reason to have more racks in business areas than in residential districts.

[Fig.3-3-4] Seoul Bike moving Pattern in Go work time

3.4. Bike Demand Forecasting by using SARIMAX Modeling

3.4.1. Return amount in go work time - STL Decomposition results

The return amount of the rental office behind Exit 5 of Magok Naru, which we looked at earlier, was decomposed into Seasonal Trend Loess (Local regression). This is the result of extracting the trend and periodicity included in the time series data through STL Decomposition with the period set to 7 days and using the Multiplicative Option.

[Fig.3-4-1] Return amount of Seoul Bike in Go work Time - STL Decomposition Results

3.4.2. Decomposition(Residual) Modeling - SARIMAX Results

We conducted time series analysis using SARIMAX on the residuals from the STL Decomposition discussed above.

We added weather data and time variables and used them as external variables. When we modeled the residuals from the STL Decomposition using SARIMAX, the results are as follows.

[Fig.3-4-2] Return residual of Seoul bike in Go work Time - SARIMAX Results

When the ADF Test was conducted separately, the results showed that stationary was secured, and when looking at the Ljung Box results after running SARIMAX, the P-Value was greater than 0.05, so there was no autocorrelation, and the Heteroskedasticity test results also showed that there was no heteroscedasticity because the P-Value was greater than 0.05. The results when using 90\% of the 4-year return volume data of the rental office behind Exit 5 of Magoknaru Station during rush hour as Train Data and predicting the remaining 10\% are as follows.

When the residuals with secured normality were predicted using the SARIMAX model with the cumulative precipitation and wind direction humidity return years as external variables, it was confirmed that heteroscedasticity was eliminated, and when the return amount was predicted using this modeling, it was confirmed that the R2 value was approximately 0.73.

[Fig.3-4-3] SARIMAX Modeling Block Diagram

3.5. Logic for improving the efficiency of bicycle relocation

[Fig.3-5-1] Idea based on the Seoul Bike Rental Station

So, the idea that we have thought about based on the Seoul Bike rental stations we have seen so far is the concept of equilibrium. If we group a day into 1 cycle based on commuting hours, it is a concept that a state of equilibrium of bicycle increase and decrease comes. If we summarize it, it can be explained as shown in the figure below.

In addition, the non-equilibrium state is a pattern that can be confirmed by limiting it to the commuting and returning hours of the business area. The non-equilibrium state 1 (commuting hours) where there is an excess of bicycles during the morning commuting hours when returns are the main thing, and the non-equilibrium state 2 (commuting hours) where there is a shortage of bicycles during the afternoon commuting hours when rentals are the main thing can be diagrammed as shown below.

Ultimately, the core logic of this paper is to predict the user factor and determine the optimal redistribution factor in order to resolve the temporal Seoul Bike non-equilibrium of commuting and returning hours.

[Fig.3-5-2] Idea based on the Seoul Bike Rental Station

To summarize the content so far, the areas where Seoul Bike reallocation is most necessary are areas with a large imbalance in bicycle rentals and returns in the super-large business districts where the most rentals and returns occur during commuting hours. (Example: Gang-seo LGSP)

Since people who ride Seoul Bike to work are more likely to ride Seoul Bike home, it would be more efficient to place them only in areas where demand is expected to be excessively high in the afternoon and a shortage is expected, rather than evenly reallocating bicycles that were crowded in the morning by utilizing the equilibrium state.

Also, since a shortage is more likely to cause dissatisfaction with Seoul Bike use than an excess, it is important to predict a shortage rather than an excess, which is the summary of the ideas derived so far.

3.6. Seoul Bike cost efficiency solution according to 1-Day Index Range

In order to create a logic for selecting the rental stations to be relocated, the 1 Day Seoul Bike equilibrium state Index can be selected as the ratio of the expected rental volume and the expected return volume, and the equilibrium state and non-equilibrium state can be divided.

[Fig.3-6-1] Logic for selecting the minimum number of rental stations

The wider the range of the Index 1 is selected, the more rental stations are excluded from the relocation target, which reduces the cost. In addition, the non-equilibrium state is divided into excess and shortage states, and the relocation is carried out mainly in the shortage state, so that the Seoul Bike bike relocation is carried out in a way that maximizes customer satisfaction.

In order to verify this logic for selecting the rental stations to be relocated to the Seoul Bike, we first looked at the distribution of the Seoul Bike Index and the cumulative amount. The distribution of the D Index and the cumulative number for June 2023 was expressed as a violin plot and histogram.

[Fig.3-6-2] D_Index & Cumulative Number of Seoul Bike for June 2023

The cumulative number, which is the return amount minus the rental amount, mostly had a ratio of 0, and the D Index, which is the ratio of the return amount to the rental amount, had a ratio of 1.

This can be seen as confirming a certain degree of equilibrium between the rental and return amounts of the Seoul Bike mentioned earlier. Therefore, I visualized the efficiency of Seoul Bike reallocation using the temporal equilibrium state through the D Index mentioned earlier.

The reduction in Seoul Bike reallocation cost using the D Index Range can be flexibly managed by flexibly adjusting the ratio of the Seoul Bike D Index as shown in the figure below.

[Fig.3-6-3] Decrease in relocation costs by utilizing D_Index Range

As we have seen above, efficient operation is possible by increasing the Index Range excluding relocation and reducing the relocation cost, and it is also possible to operate by flexibly adjusting the Index ratio according to the allocated cost of the Seoul City Seoul Bike budget.

The example presented below is visualized data when the Seoul City operation unit is divided using K-Means Clustering and the D Index Range is flexibly operated from 0.1 to 0.4.

This is the result of diagramming the entire Seoul City using the same method as the Gangseo-gu visualization data confirmed earlier.

In the case of the minimum Index Range excluding relocation target rental stations of 0.1, the ratio of excluded rental stations is 13.4\% and the ratio of target rental stations is 86\%, which is a high ratio. This case will be the most expensive and the relocation efficiency is bound to be low.

[Fig.3-6-4] The ratio of rental station count to relocation according to the D Index
Range

In the case where the Index Range for the target rental stations is 0.4, which is the maximum, the ratio of rental stations excluded from the target rental station relocation is 66.6\%, and the ratio of rental stations targeted for relocation is low at 33.4\%.

In this case, the cost of relocation for Seoul bike will be greatly reduced, and the relocation efficiency can also be the highest.

If the ratio of rental stations relocated according to the D Index Range is operated with a focus on practicality, the Index ratio can be flexibly adjusted according to the Seoul Metropolitan Government's Seoul bike budget.

As we have seen above, the ratio of rental stations excluded and rental stations targeted has a trade-off tendency,
and the more the ratio of rental stations excluded using the Seoul bike D Index is increased, the fewer bicycles will be targeted for relocation, which can reduce the relocation cost.

3.7. Seoul Bike operation plan idea using spatial equilibrium

Using the spatial equilibrium examined so far, we have come to think that clustering is necessary to minimize the distance of bicycle redistribution movement by using the distribution of total returns and rentals by district, and clustering that has a partial sum where business districts and residential districts are grouped as a pair.

Since there is a distribution of total returns and rentals by district, we thought about how about managing the redistribution districts by grouping them according to the overall increase and decrease.

Therefore, if the selection of districts according to the increase and decrease by district is selected so that the total daily cumulative number of bicycles within the district is close to 0, there is no need to exchange Seoul Bike between the set districts for redistribution, and the effect of reducing the distance of redistribution bicycles within the district area can be achieved, which leads to improved redistribution efficiency overall because the travel distance of redistribution trucks is reduced.

[Fig.3-7-1] The ratio of rental station bike count to relocation according to the D
Index Range

3.7.1. Clustering application idea for implementing spatial equilibrium of Seoul Bike

To represent the spatial equilibrium as a mathematical vector, we can think of moving from a rental station node to a return station node as a link.

With this idea, we converted Seoul bike rental information into graph data and then came up with the idea of finding a community partition set with the most links.

[Fig.3-7-2] Network analysis idea for spatial equilibrium analysis of Seoul bikes

3.7.2. What is Network Community Detection?

The problem of dividing a graph into several clusters is called Community Detection.

While thinking about the algorithm to apply to the idea I thought of earlier, I thought about using a cluster detection algorithm that can be used for Seoul Bike movement network data.

The basic concept is that ‘groups with high connection density are tied together’, and the algorithm that calls a group that is tied together and more tightly tied (with high modularity) a Community is defined as Network Community Detection.

[Fig.3-7-4] Community partition clustering in the sense of spatial equilibrium

3.7.3. What is Modularity?

Here, we will learn more about the definition of Modularity, a concept that can quantify the concentration of Network Nodes.

First, the definition of Modularity quantifies the degree to which Nodes are concentrated based on Links in a network structure. The purpose is to measure the structure of a structurally divided network (graph), and the degree of modularity of the network can be expressed as a value of -1 to 1. A value of around 0.3 to 0.7 can be said to indicate a significant cluster.

It can be said to be an Index with a large value when there are many connections within a distinct group within the network and few connections between groups.

The definition of the Index is defined as the ratio of the density of [links within a community] and [links between communities].

For example, a network with high modularity has dense connections between nodes within a module, so the clustering ratio of the same area is high, and the connections between nodes in other modules are sparse, indicating clustering in other areas, which is expressed as a modularity value close to 0.

Modularity

Quantifies the degree to which nodes are clustered based on links in a network structure
Measures the structure of a structurally divided network (graph)
Indicates the degree of modularity of a network with a value of -1 to 1 (significant clustering of 0.3 to 0.7)
A measure that indicates the property that there are many connections within a distinct group in a network and few connections between groups
: Defined as the ratio of the density of [inter-community links] and [inter-community links]

* A network with high modularity has dense connections between nodes within a module (clustering in the same area) but sparse connections between nodes in different modules (clustering in different areas).

3.7.4. Introduction to Louvain Algorithm

Among the algorithms that utilize modularity, we decided to verify the idea through the Louvain algorithm, a representative algorithm. The Louvain algorithm is divided into Phase 1 and Phase 2.

The purpose of Phase 1 is to maximize modularity. The method is to measure modularity by placing a node in another adjacent community and determine the cluster so that modularity is maximized.

The purpose of Phase 2 is to simplify the network by maximizing modularity in Phase 1. The link weights that were connected between existing communities are merged into a single link, and the links between nodes in new communities are replaced with self-loops to simplify the network.

4. Results

4.1. Cumulative number of Seoul Bike rentals in Gangseo-gu according to D Index

The picture you see is a picture showing the cumulative number of Seoul Bike bicycles in Gangseo-gu by rental station location in June 2023, with a D Index of 0.95 to 1.05 and a range of 0.1.

Transparent circles are negative cumulative amounts, indicating places with a large number of rentals, and filled circles are places with a large number of returns. Also, the places with dots are rental stations excluded from the reallocation target.

Currently, only rental stations within the range of 0.1 are excluded, so there are many target rental stations, and the cost is higher than other cases with a large range. (Low reallocation efficiency)

When the range is as wide as 0.4, this is the case where there are the most rental stations excluded from relocation, and in this case, there are the fewest rental stations subject to relocation, so the cost of relocating Seoul Bike is greatly reduced, making it an efficient section.

[Fig.4-1-1] Gangseo-gu rental stations scheduled to relocate bikes (D Index : 0.1)

[Fig.4-1-2] Gangseo-gu rental stations scheduled to relocate bikes (D Index : 0.2)

[Fig.4-1-3] Gangseo-gu rental stations scheduled to relocate bikes (D Index : 0.3)

[Fig.4-1-4] Gangseo-gu rental stations scheduled to relocate bikes (D Index : 0.4)

4.2. Cumulative number of Seoul Bike rentals in Seoul according to D Index

The picture you see is a picture showing the cumulative number of Seoul bikes by rental station location in Seoul as of June 2023. The D index is 0.95~1.05, and the range is 0.1.

Transparent circles indicate places with a large number of rentals with negative cumulative numbers, and filled circles indicate places with a large number of returns. Also, the places with dots are rental stations excluded from the redistribution target.

Currently, only rental stations within the 0.1 range are excluded, so the cost is higher than other cases with many target rental stations and a large range. (Low redistribution efficiency)

[Fig.4-2-1] Seoul rental stations scheduled to relocate bikes (D Index : 0.1)

[Fig.4-2-2] Seoul rental stations scheduled to relocate bikes (D Index : 0.2)

[Fig.4-2-3] Seoul rental stations scheduled to relocate bikes (D Index : 0.3)

[Fig.4-2-4] Seoul rental stations scheduled to relocate bikes (D Index : 0.4)

In the case where the Index Range for the Rental Stations Excluding Relocation Targets is 0.4, the ratio of rental stations excluded from the relocation target for Seoul Bike is 66.6\%, and the ratio of rental stations subject to relocation is low at 33.4\%.

In this case, the cost of Seoul Bike relocation will be greatly reduced, and the relocation efficiency can also be the highest.

4.3. Results of spatial equilibrium implementation through application of Louvain algorithm

As explained in the idea above, the result of setting up Node and Edge using the rental and return data of Seoul Bike and applying the Louvain algorithm showed much better results than K-Means, which simply used the Euclidean coordinates of the rental station location.

When clustering, the cumulative average by region was used as an indicator of performance, and the closer it is to 0, the closer it is to the partial sum, which means that the bicycle is likely to move only within the clustering region.

K-Means is 21.19, while Louvain is 9.23. We were able to confirm that the cumulative average by region was reduced by almost half.

[Fig.4-3-1] Results of spatial equilibrium implementation through application of
Louvain algorithm

And the K-Means algorithm shows clustering that ignores the Han River, while the Louvain algorithm clustering seems to reflect the geographical characteristics of Seoul better.

The results below are the results of the Seoul Bike clustering classified by the K-Means and Louvain algorithms, colored on the map of Seoul.

While K-Means shows results that cannot distinguish the Han River, the Louvain algorithm shows the boundary of the Han River much more clearly.

[Fig.4-3-2] Clustering results comparison : Louvain Vs K-Means

[Fig.4-3-3] Louvain Result : Go work Time

[Fig.4-3-4] Louvain Result : Off work Time

References

1. F. Chiariotti, C. Pielli, A. Zanella and M. Zorzi, "A dynamic approach to rebalancing bike-sharing systems", Sensors

2. M. Dell’Amico, E. Hadjicostantinou, M. Iori and S. Novellani, "The bike sharing rebalancing problem: Mathematical formulations and benchmark instances"

3. P. Yi, F. Huang and J. Peng, "A rebalancing strategy for the imbalance problem in bike-sharing systems"

4. C. M. de Chardon, G. Caruso and I. Thomas, "Bike-share rebalancing strategies patterns and purpose"

5. C. Zhang, L. Zhang, Y. Liu and X. Yang, "Short-term prediction of bike-sharing usage considering public transport: A lstm approach"

6. S. Ruffieux, N. Spycher, E. Mugellini and O. A. Khaled, "Real-time usage forecasting for bike-sharing systems: A study on random forest and convolutional neural network applicability"

Fiscal Games in Overlapping Jurisdictions

Published

October 15, 2024

Jay Hyoung-Keun Kwon*

* Swiss Institute of Artificial Intelligence, Chaltenbodenstrasse 26, 8834 Schindellegi, Schwyz, Switzerland

PDF View PDF Download

I. Introduction

Recent developments in the urban landscape—such as sub-urbanization, counter-urbanization, and re-urbanization—have given rise to complicated scenarios where administrative jurisdictions are newly formed and intersect with existing ones. This presents unique challenges to traditional models of tax competition. This paper examines the economic implications of such overlapping jurisdictions, particularly focusing on their impact on tax policy.

Overlapping jurisdictions are characterized by multiple local governments exercising different levels of authority over the same geographic area. These arrangements often merge as responses to address urban sprawl or provide specialized services tailored to local needs. However, they also create a intricate network of fiscal relationships that can lead to inefficiencies in resource allocation and public service delivery.

For instance, local governments in the U.S., including counties, cities, towns, and special districts, have varying degrees of taxing powers. Residents in certain areas might be subject to property taxes levied by their town, county, school district, and special districts (such as fire or library districts), all operating within the same geographic space. This multi-layered governance structure not only affects residents' tax burdens but also influences local governments' decision-making processes regarding tax rates and public service provision. The resulting fiscal landscape provides a rich setting for examining the dynamics of tax competition and cooperation among overlapping jurisdictions.

Traditional models of tax competition, such as Wilson [4] and Zodrow and Mieszkowski [5], typically assume clear demarcations between competing jurisdictions. However, these models do not adequately capture the dynamics of overlapping administrative divisions. In such settings, local governments must navigate not only horizontal competition with neighboring jurisdictions but also a form of vertical competition within the same geographic space.

This paper aims to extend the literature on tax competition by developing a theoretical framework that accounts for the unique characteristics of overlapping jurisdictions. Specifically, we address the following research questions:

How do overlapping administrative divisions affect the strategic tax-setting behavior of local governments?
What are the implications of such overlapping structures for the provision of public goods and services?
How does the presence of overlapping jurisdictions influence the welfare outcomes predicted by traditional tax competition models?

To address these questions, we develop a game-theoretic model that incorporates multiple layers of local government operating within the same geographic space. This approach allows us to analyze the strategic interactions between overlapping jurisdictions and derive insights into the resulting equilibrium tax rates and levels of public good provision.

Our analysis contributes to the existing literature in several ways. First, it provides a formal framework for understanding tax competition in the context of overlapping jurisdictions, which is increasingly relevant in modern urban governance. Second, it offers insights into the potential inefficiencies that arise from such administrative structures and suggests possible policy interventions to mitigate these issues. Finally, it extends the theoretical foundations of fiscal federalism to account for more complex governance arrangements.

Our study addresses the current realities of fiscal federalism in developed economies as well as provides valuable insights for countries where local governments are yet to achieve significant fiscal autonomy. The lessons drawn from this analysis can inform policy discussions on decentralization, local governance structures, and intergovernmental fiscal relations in various contexts.

The remainder of this paper is organized as follows: Section II reviews the relevant literature on tax competition and fiscal federalism. Section III presents our theoretical model and derives key equilibrium results. Section IV discusses the implications of our findings for public policy and urban governance. Section V concludes and suggests directions for future research.

II. Literature Review

Tax competition has been one of the central themes in public economics. Tiebout [1]'s seminal work on local public goods laid the foundation for this field, proposing a model of "voting with feet" where residents moving to jurisdictions offering their preferred combination of taxes and public services. Oates [2] further developed these ideas and presented the decentralization theorem which posits that, under certain conditions, decentralized provision of public goods is welfare-maximizing.

Works of Wilson [4] and Zodrow and Mieszkowski [5] developed the basic tax competition model, where jurisdictions compete for a mobile capital tax base. This model predicts inefficiently low tax rates and underprovision of public goods, which is often referred to as the "race to the bottom." Wildasin [6] further demonstrated that the Nash equilibrium in tax rates is generally inefficient by incorporating strategic interactions between jurisdictions.

Researchers began to consider more diverse institutional settings. Keen and Kotsogiannis [11] analyzed the interaction between vertical tax competition (between different levels of government) and horizontal tax competition (between governments at the same level). Their work demonstrated that in federal systems, the tax rates can be high or low depending on the relative strength of vertical and horizontal tax externalities, contrary to the ``race to the bottom'' prediction of earlier models.

Itaya, Okamura, and Yamaguchi [12] examined tax coordination in a repeated game setting with asymmetric regions. They find that the sustainability of tax coordination depends on the degree of asymmetry between regions and the type of coordination--partial or full. While asymmetries can complicate coordination efforts, the repeated nature of interactions can facilitate cooperation under certain conditions. Their work demonstrates that full tax coordination can be sustained for a wider range of parameters compared to partial coordination.

Building on this, Ogawa and Wang [14] incorporated fiscal equalization into the framework of asymmetric tax competition in a repeated game context. Their findings reveal that fiscal equalization can influence the sustainability of tax coordination, sometimes making it more difficult to maintain. The impact of equalization schemes on tax coordination is contingent on the degree of regional asymmetry and the specific parameters of the equalization policy.

The case of overlapping jurisdictions represents a frontier in tax competition research. While not extensively studied, some works have begun to address such cases. Hochman, Pines, and Thisse [9] developed a model of metropolitan governance with overlapping jurisdictions, showing how this can lead to inefficiencies in public good provision. Esteller-Mor´e and Sol´e-Oll´e [10] analyzed tax mimicking in a setting with overlapping tax bases, finding evidence of both horizontal and vertical interactions.

Game-theoretic approaches have been instrumental in advancing our understanding of tax competition dynamics. Wildasin [6] pioneered the use of game theory in tax competition, modeling jurisdictions as strategic players in a non-cooperative game. This approach demonstrated that the Nash equilibrium in tax rates is generally inefficient, providing a formal basis for the ``race to the bottom'' hypothesis. The work of Itaya, Okamura, and Yamaguchi [12] and Ogawa and Wang [14] further extended this game-theoretic approach to repeated games, offering insights into the possibilities for tax coordination over time.

While these game-theoretic approaches have significantly advanced our understanding of tax competition, they have largely failed to address the complexities of fully overlapping jurisdictions. Most models assume clear boundaries between competing jurisdictions, leaving a gap in our understanding of scenarios where multiple levels of government have taxing authority over the same geographic area.

The welfare implications of tax competition have been a subject of ongoing debate. While the "race to the bottom" hypothesis suggests negative welfare consequences, some scholars have argued for potential benefits. Brennan and Buchanan [3] proposed that tax competition could serve as a check on the excessive growth of government, a view that has found some support in subsequent empirical work (e.g., [13]).

Policy responses to tax competition have also been extensively studied. Proposals range from tax harmonization [7] to the implementation of corrective subsidies [8]. The effectiveness of these measures, particularly in complex settings with overlapping jurisdictions, remains an active area of research.

While the literature on tax competition has made significant strides in understanding the dynamics of fiscal interactions between jurisdictions, several areas warrant further investigation. The case of fully overlapping jurisdictions, in particular, presents a rich opportunity for both theoretical modeling and empirical analysis. This study aims to fill in this gap by accounting for overlapping jurisdictions in traditional game-theoretic models of tax competition.

III. Model

This study extends the tax competition models of Itaya, Okamura, and Yamaguchi [12] and Ogawa and Wang [14] by introducing an overlapping jurisdiction. Our approach is grounded in the Solow growth model, which provides a robust framework for analyzing long-term economic growth and capital accumulation. The Solow model's emphasis on capital accumulation and technological progress makes it suitable for our analysis of tax competition, as these factors influence jurisdictions' tax bases and policy decisions.

The Solow model's assumptions of diminishing returns to capital and constant returns to scale align well with our focus on regional differences in capital endowments and production technologies. Moreover, its simplicity allows for tractable extensions to multi-jurisdiction settings.

A. Setup

We consider a country divided into three regions: two asymmetric regions, $S$ and $L$, and an overlapping region, $O$, which equally overlaps with $S$ and $L$. All regions have independent authority to impose capital taxes. This setup allows us to examine the interactions between horizontal tax competition (between $S$ and $L$) and the unique dynamics introduced by the overlapping jurisdiction $O$. Let us further denote that the regions of $S$ and $L$ that do not overlap with $O$ are $SS$(Sub-$S$) and $SL$(Sub-$L$), respectively, while those that overlap with $O$ are $OS$ and $OL$ (see Figure 1).

FIGURE 1. VISUAL EXPLANATION OF THE HYPOTHETICAL COUNTRY

Here, we make the following key assumptions:

Population: Population is evenly spread across the country. Hence, regions $S$ and $L$ have equal populations. Furthermore, regions $SS$, $SL$, and $O$ have equal populations. This assumption, while strong, allows us to isolate the effects of capital endowment and technology differences.
Labor Supply and Individual Preferences: Residents inelastically supply one unit of labor to firms in their region and have identical preferences. Furthermore, they strive to maximize their utilities given their budget constraints. While this assumption simplifies labor market dynamics, it is reasonable in the short to medium term, especially in areas with limited inter-regional mobility.
Production: Firms in each region produce homogeneous consumer goods and maximize their profits. This assumption allows us to focus on capital allocation without the complications of product differentiation.
Capital Mobility: Capital is perfectly mobile across regions, reflecting the ease of capital movement in economies, especially within a single country.
Asymmetric Endowments and Technology: Regions $S$ and $L$ differ in capital endowments and production technologies. This assumption captures real-world regional disparities and is crucial for generating meaningful tax competition dynamics.
Public Goods Provision: Regions $S$ and $L$ provide generic public goods $G$, while region $O$ provides specific public goods $H$ to the extent that maximizes their representative resident's utilities. This reflects the often-observed division of responsibilities between different levels of government.

These assumptions, while simplifying the real world, allow us to focus on the core mechanisms of tax competition in overlapping jurisdictions. They provide a tractable framework for analyzing the strategic interactions between jurisdictions while capturing key elements of real-world complexity.

B. Production and Capital Allocation

Let $\bar{k}_i$ be the capital endowment per capita for regions $i$ and $\bar{k}$ be the capital endowments per capita of the national economy. For regions $S, L$ and $O$, it can be expressed as follows:

\begin{align}
\bar{k}_{s} \equiv \bar{k} - \varepsilon,\ \ \ \ \ \ \ \ \ \ \bar{k}_{L} \equiv \bar{k} + \varepsilon,\ \ \ \ \ \ \ \ \ \ \bar{k}_O = \bar{k} \equiv \frac{\bar{k}_{s} + \bar{k}_{L}}{2}
\end{align}

where $\varepsilon \in \left( 0,\ \bar{k} \right\rbrack$ represents asymmetric endowments between regions $S$ and $L$. $\bar{k}_O = \bar{k}$ follows from the assumption that the population is evenly dispersed across the country.

Let $L_i$ and $K_i$ be the labor and capital inputs for production in region $i$. It can be easily inferred that

\begin{align}
l\equiv L_S = L_L , \ \ \ \ \ \ \ \ \ \ \frac{2}{3}l \equiv L_{SS} = L_{SL} = L_{O}.
\end{align}

Furthermore, we denote

\begin{align}
K_{SS} \equiv \alpha_S K_S, \ \ \ \ \ \ \ \ \ \ K_{SL} \equiv \alpha_L K_L
\end{align}

for $0 < \alpha_S, \alpha_L < 1$.

With the key variables defined, the production function for each region $i$ is given by:

\begin{align}
F_i(L_i, K_i) = A_i L_i + B_i K_i - \frac{K_i^2}{L_i}
\end{align}

where $A_i$ and $B_i > 2K_i / L_i$ represent labor and capital productivity coefficients, respectively. Although regions $S$ and $L$ differ in capital production technology, there is no difference in labor production technology, so $A_{L} = A_{S}$ while $B_{L} \neq \ B_{S}$. Note that this function exhibits constant returns to scale and diminishing returns to capital. Furthermore, we assume that sub-regions without overlaps ($SL$ and $SS$) have equivalent technology coefficients with their super-regions ($L$ and $S$). The technology parameter of the overlapping region is a weighted average of $B_S$ and $B_L$, where the weights are the proportion of capital invested from $S$ and $L$.

As mentioned above, capital allocation across regions is determined by profit-maximizing firms and the free movement of capital. Let $\tau_i$ be the effective tax rate for region $i$. Then, we can infer that the real wage rate $w_i$ and real interest rates $r_i$ are:

\begin{equation}
\begin{aligned}
w_i &= A_i + \left(\frac{K_i}{L_i}\right)^2 \\
r_i &= B_i - 2K_i/L_i - \tau_i - t_i
\end{aligned}
\end{equation}

where $t_i = (1-\alpha_i)\tau_O$ for $i \in \{S, L\}$, $0$ for $i\in \{SS, SL\}$, $\tau_S$ for $i = OS$, and $\tau_L$ for $i = OL$.

The capital market equilibrium for the national economy is reached when the sum of capital demands is equal to the exogenously fixed total capital endowment: $K_S + K_L = 2l\bar{k}$. In equilibrium, the interest rates and capital demanded in each region are as follows:
\begin{equation}
\begin{aligned}
&r^* = \frac{1}{2}\big(\left(B_S + B_L \right) - \left(\tau_S + \tau_L + (2-\alpha_S-\alpha_L)\tau_O\right)\big) - 2\bar{k} \\
&K_S^* = lk_S^* = l\bigg(\bar{k} + \frac{1}{4}\big( (\tau_L - \tau_S - (\alpha_L - \alpha_S)\tau_O ) - (B_L - B_S)\big) \bigg) \\
&K_L^* = lk_L^* = l\bigg(\bar{k} + \frac{1}{4}\big( (\tau_S - \tau_L + (\alpha_L - \alpha_S)\tau_O ) + (B_L - B_S)\big) \bigg) \\
&K_{SS}^* = \frac{2}{3}lk_{SS}^* = \frac{2l}{3}\bigg(\bar{k} + \frac{1}{4}\big( (\tau_L - \tau_S + (2 - \alpha_L - \alpha_S)\tau_O ) - (B_L - B_S)\big) \bigg) \\
&K_{SL}^* = \frac{2}{3}lk_{SL}^* =\frac{2l}{3}\bigg(\bar{k} + \frac{1}{4}\big( (\tau_S - \tau_L + (2 - \alpha_L - \alpha_S)\tau_O ) + (B_L - B_S)\big) \bigg) \\
&K_O^* = \frac{2}{3}lk_O^* = \frac{2l}{3}\left(\bar{k} - \frac{1}{2}(2-\alpha_S-\alpha_L)\tau_O\right)
\end{aligned}
\end{equation}
We denote $B_L - B_S = \theta$, henceforth.

C. Government Objectives and Tax Rates

Given that individuals in the country have identical preferences and inelastically supply one unit of labor to the regional firms, we can infer that all inhabitants receive a common return on capital of $r^*$ eventually, and they use all income to consume private good $c_i$. Hence, the budget constraint for an individual residing in region $i \in \{S, L, O\}$ and the sum of individuals in region $i$ will be

\begin{equation}
\begin{aligned}
c_i &= w_i^* + r^*\bar{k}_i \\
C_i &= \begin{cases}
l(w^*_i + r^*\bar{k}_i) & \text{ for } i \in \{S, L\}\\
\dfrac{2l}{3}(w^*_i + r^*\bar{k}) & \text{ for } i = O
\end{cases}
\end{aligned}
\end{equation}

In addition, we have assumed that the overlapping district is a special district providing a special public good--for example education or health--that the other two districts do not provide. $S$ and $L$ provide their local public goods $G_i$. Then the total public goods provided in region $i$ can be expressed as:

\begin{equation}
\begin{aligned}
G_i &= \begin{cases}
K_i^*\tau_i & \text{ for } i \in \{S, L\}\\
(1-\alpha_S)K_S^*\tau_S + (1-\alpha_L)K_L^*\tau_L & \text{ for } i = O
\end{cases} \\
H_i &= \begin{cases}
(1-\alpha_i)K_i^*\tau_O & \hspace{1.45in} \text{ for } i \in \{S, L\}\\
K_i^*\tau_i & \hspace{1.45in} \text{ for } i = O
\end{cases}
\end{aligned}
\end{equation}

Accordingly, each government in region $i$ chooses $\tau_i$ such that maximizes the following social welfare function, which is represented as the sum of individual consumption and public good provision:

\begin{equation}
\begin{aligned}
& u(C_i, G_i, H_i) \equiv C_i + G_i + H_i
\end{aligned}
\end{equation}

This objective function captures the trade-off faced by governments between attracting capital through lower tax rates and generating revenue for public goods provision. After solving equation (9), we obtain the reaction functions, i.e. the tax rates at the market equilibrium (see Appendix 1 for details):

\begin{equation}
\begin{aligned}
\tau_S^* &= \frac{4\varepsilon}{3} - \frac{\theta}{3} + \frac{\tau_L}{3} - \frac{2-3\alpha_S + \alpha_L}{3}\tau_O \\
\tau_L^* &= -\frac{4\varepsilon}{3} + \frac{\theta}{3} + \frac{\tau_S}{3} - \frac{2-3\alpha_L + \alpha_S}{3}\tau_O \\
\tau_O^* &= \frac{3(\alpha_L + \alpha_S)-4}{(2-(\alpha_L + \alpha_S))(\alpha_L + \alpha_S)}\bar{k} = \Gamma \bar{k}
\end{aligned}
\end{equation}

D. Nash Equilibrium Analysis

The tax rates derived in the previous section represent the optimal response functions for each region. These functions encapsulate each region's best strategy given the strategies of other regions, as each jurisdiction aims to maximize its social welfare function. In essence, these functions delineate the most advantageous tax rate for each region, contingent upon the tax rates set by other regions.

The existence of a Nash equilibrium is guaranteed in our model, as the slopes of the reaction functions are less than unity, satisfying the contraction mapping principle. This ensures that the iterative process of best responses converges to a unique equilibrium point given $\alpha_L$ and $\alpha_S$. The one-shot Nash equilibrium tax rates are given by (see Appendix 2 for details):

\begin{equation}
\begin{aligned}
\tau_S^N &= \varepsilon - \frac{\theta}{4} - \Gamma\left(1-\alpha_S \right)\bar{k} \\
\tau_L^N &= -\left(\varepsilon-\frac{\theta}{4}\right) - \Gamma\left(1-\alpha_L\right)\bar{k} \\
\tau_O^N &= \Gamma\bar{k}
\end{aligned}
\end{equation}

These equilibrium tax rates reveal several important insights. First, the tax rates of regions $S$ and $L$ are influenced by the asymmetry in capital endowments ($\varepsilon$) and productivity ($\theta$), as well as the presence of the overlapping jurisdiction $O$. Second, the overlapping jurisdiction's tax rate is solely determined by the average capital endowment ($\bar{k}$) and the proportion of resources allocated from S and L ($\alpha_S$ and $\alpha_L$). Third, When $\alpha_L + \alpha_S = 4/3$, we have $\tau_O^N = 0$, which effectively reduces our model to a scenario without the overlapping jurisdiction.

The Nash equilibrium also yields equilibrium values for the interest rate and capital demanded in each region:

\begin{equation}
\begin{aligned}
r^N &= \frac{1}{2}\left(B_S + B_L\right) - 2\bar{k} \\
K_S^N &= l\left(\bar{k} - \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)\right) = l\left(\bar{k}_S + \frac{1}{2}\left(\varepsilon - \frac{\theta}{4} \right)\right) \\
K_L^N &= l\left(\bar{k} + \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)\right) = l\left(\bar{k}_L - \frac{1}{2}\left(\varepsilon -
\frac{\theta}{4} \right)\right) \\
K_{SS}^N &= \frac{2l}{3}\left(\bar{k} + \frac{1}{2}\bar{k}\Gamma(1-\alpha_S) - \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)
\right) = \alpha_S K_S^N \\
K_{SL}^N &= \frac{2l}{3}\left(\bar{k} + \frac{1}{2}\bar{k}\Gamma(1-\alpha_L) + \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right) \right) = \alpha_L K_L^N \\
K_O^N &= \frac{l}{3}\cdot\frac{4 -\alpha_L - \alpha_S}{\alpha_L + \alpha_S}\bar{k}
\end{aligned}
\end{equation}
These equilibrium conditions lead to two key lemmas that characterize the behavior of our model:

LEMMA 1 (Net Capital Position): The sign of $\Phi \equiv \varepsilon - \frac{\theta}{4}$ determines the net capital position of regions S and L. When $\Phi > 0$, L is a net capital exporter and S is a net capital importer, and vice versa when $\Phi < 0$.
PROOF:
From equation (12), we can see that:
\begin{align*}
K_L^N - K_S^N &= l\left(\left(\bar{k} + \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)\right) - \left(\bar{k} - \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)\right)\right) \\
&= l\left(\varepsilon + \frac{\theta}{4}\right)
\end{align*}
The sign of this difference is determined by $\varepsilon - \frac{\theta}{4} \equiv \Phi$.

LEMMA 2 (Overlapping Jurisdiction’s Effectiveness): The sign of $\Gamma \equiv \frac{3\left( \alpha_{L} + \alpha_{S} \right) - 4}{\left( 2 - \left( \alpha_{L} + \alpha_{S} \right) \right)\left( \alpha_{L} + \alpha_{S} \right)}$ determines the effective tax rate of O. Moreover, $\alpha_{L} + \alpha_{S}$ must be greater than 4/3 for O to provide a positive sum of special public good H.
PROOF:
From equation (11), we see that the sign of $\tau_O^N$ is determined by the sign of $\Gamma$. The numerator of $\Gamma$ is positive when $\alpha_{L} + \alpha_{S} > 4/3$, and the denominator is always positive for $\alpha_{L} + \alpha_{S} < 2$. Therefore, $\Gamma > 0$ (and consequently $\tau_O^N > 0$) when $\alpha_{L} + \alpha_{S} > 4/3$.

These lemmas provide crucial insights into the dynamics of our model. First, the introduction of the overlapping jurisdiction $O$ does not alter the net capital positions of $S$ and $L$ compared to a scenario without $O$. The capital flow between $S$ and $L$ is determined solely by the relative strengths of their capital endowments ($\varepsilon$) and productivity differences ($\theta$). In addition, the effectiveness of the overlapping jurisdiction in providing public goods is contingent on receiving a sufficient allocation of resources from $S$ and $L$.

These findings contribute to our understanding of tax competition in multi-layered jurisdictional settings and provide a foundation for analyzing the welfare implications of overlapping administrative structures.

IV. Simulations and Results

To better understand the implications of our theoretical model and address the research questions posed in the introduction, we conducted a series of simulations. These simulations allow us to visualize the non-linear relationships between key variables and provide insights into the strategic behavior of jurisdictions in our overlapping tax competition model.

A. Net Capital Positions and Tax Competition Dynamics

Our first simulation focuses on the net capital positions of regions $S$ and $L$, as determined by the parameter $\Phi \equiv \varepsilon - \theta/4$. Figure 2 illustrates how changes in $\Phi$ affect the capital demanded by each region.

As shown in Figure 2, when $\Phi > 0$, region $L$ becomes a net capital exporter, while region $S$ becomes a net capital importer. This result directly addresses our first research question about how overlapping administrative divisions affect strategic tax-setting behavior. The presence of the overlapping jurisdiction $O$ does not alter the net capital positions of $S$ and $L$ compared to a scenario without $O$.

However, it does influence their tax-setting strategies, as evidenced by the Nash equilibrium tax rates in equation (12). These equations show that $S$ and $L$ adjust their tax rates in response to the overlapping jurisdiction $O$ by factors of $\Gamma(1-\alpha_S)\bar{k}$ and $\Gamma(1-\alpha_L)\bar{k}$, respectively. This strategic adjustment demonstrates how the presence of an overlapping jurisdiction alters tax-setting behavior, even when it doesn't change net capital positions.

FIGURE 3. REPRESENTATIVE RESIDENTS’ UTILITY FROM PUBLIC GOODS

B. Public Good Provision and Welfare Implications

Then, we examine the utility derived from public goods by representative residents in each region. Figure 3 visualizes these utilities across different values of $\Phi$ and $\tau_O$.

Figure 3 reveals several important insights. First, the utility derived from public goods varies significantly across sub-regions ($SS$, $SL$, $OS$, $OL$), highlighting the complex welfare implications of overlapping jurisdictions. Second, the overlapping region $O$'s tax rate ($\tau_O$) has a substantial impact on the utility derived from public goods, especially in the overlapping sub-regions $OS$ and $OL$. Third, the relationship between $\Phi$ and public good utility is non-linear and differs across regions, suggesting that the welfare implications of tax competition are not uniform. These findings suggest that the presence of an overlapping jurisdiction can lead to heterogeneous welfare effects.

C. The Role of the Overlapping Jurisdiction

Our model and simulations highlight the crucial role played by the overlapping jurisdiction $O$. The tax rate of $O$ ($\tau_O^N = \Gamma\bar{k}$) is determined by the proportion of resources allocated from the primary economies ($\alpha_S$ and $\alpha_L$), or specifically $\Gamma$. This relationship reveals that the overlapping jurisdiction's ability to provide public goods ($H$) is contingent on receiving a sufficient proportion of resources from $S$ and $L$. Specifically, $\alpha_L + \alpha_S$ must exceed $4/3$ for $O$ to provide a positive sum of special public goods.

This finding has important implications for the design of multi-tiered governance systems. It suggests that overlapping jurisdictions need a critical mass of resource allocation to function effectively, which may inform decisions about the creation and empowerment of special-purpose districts or other overlapping administrative structures.

V. Conclusion

This study has examined the dynamics of tax competition in regions with overlapping tax jurisdictions, leveraging game theory to develop a theoretical framework for understanding these administrative structures. By constructing a simplified model and deriving Nash equilibrium conditions, we have identified several key insights that contribute to the existing literature on tax competition. Our analysis reveals that while the introduction of an overlapping jurisdiction does not alter the net capital positions of the primary regions, it leads to strategic adjustments in tax rates. This finding extends the traditional models of tax competition by incorporating the complexities of multi-tiered governance structures.

The effectiveness of the overlapping jurisdiction in providing public goods is found to be contingent on receiving a sufficient allocation of resources from the primary regions. Moreover, our simulations demonstrate that the presence of overlapping jurisdictions can lead to heterogeneous welfare effects across sub-regions, challenging the uniform predictions of traditional tax competition models and suggesting the need for more nuanced policy approaches.

While our study provides valuable insights, it is important to acknowledge its limitations. The use of a simplified model, while allowing for tractable analysis, inevitably omits some real-world complexities. Factors such as population mobility, diverse tax bases, and income disparities among residents were not incorporated into the model. Furthermore, our analysis is static, which may not capture the dynamic nature of tax competition and capital flows over time. The assumption of identical preferences for public goods across all residents may not reflect the heterogeneity of preferences in real-world settings. Additionally, the model assumes perfect information among all players, which may not hold in practice where information asymmetries can influence strategic decisions.

To address these limitations and further advance our understanding of tax competition in a wide array of administrative structures, several avenues for future research are proposed. Developing dynamic models that capture the evolution of tax competition over time, potentially using differential game theory approaches, could provide insights into the long-term implications of overlapping jurisdictions. Incorporating heterogeneous preferences for public goods among residents would allow for a more nuanced examination of how diverse citizen demands affect tax competition and public good provision in overlapping jurisdictions.

Empirical studies using data from regions with overlapping jurisdictions, such as special districts in the United States, could test the predictions of our theoretical model and provide valuable real-world validation. Extending the model to include various policy interventions, such as intergovernmental transfers or tax harmonization efforts, could help evaluate their effectiveness in mitigating potential inefficiencies. Incorporating insights from behavioral economics to account for bounded rationality and other cognitive factors may provide a more realistic representation of tax-setting behavior in complex jurisdictional settings.

In conclusion, this study provides a theoretical foundation for understanding tax competition in regions with overlapping jurisdictions. By highlighting the complex interactions between multiple layers of government, our findings contribute to the broader literature on fiscal federalism and public economics. As urbanization continues and governance structures become increasingly complex, the insights derived from this research can inform policy discussions on decentralization, local governance structures, and intergovernmental fiscal relations. Future work in this area has the potential to significantly enhance our understanding of modern urban governance and contribute to the development of more effective and equitable fiscal policies in multi-tiered administrative structures.

References

[1] C. M. Tiebout, “A pure theory of local expenditures,” Journal of Political Economy, vol. 64, no. 5, pp. 416–424, 1956.
[2] W. E. Oates, Fiscal federalism. Harcourt Brace Jovanovich, 1972.
[3] G. Brennan and J. M. Buchanan, The power to tax: Analytical foundations of a fiscal constitution. Cambridge University Press, 1980.
[4] J. D. Wilson, “A theory of interregional tax competition,” Journal of Urban Economics, vol. 19, no. 3, pp. 296–315, 1986.
[5] G. R. Zodrow and P. Mieszkowski, “Pigou, tiebout, property taxation, and the underprovision of local public goods,” Journal of Urban Economics, vol. 19, no. 3, pp. 356–370, 1986.
[6] D. E. Wildasin, “Nash equilibria in models of fiscal competition,” Journal of Public Economics, vol. 35, no. 2, pp. 229–240, 1988.
[7] R. Kanbur and M. Keen, “Jeux sans fronti`eres: Tax competition and tax coordination when countries differ in size,” American Economic Review, pp. 877–892, 1993.
[8] J. A. DePater and G. M. Myers, “Strategic capital tax competition: A pecuniary externality and a corrective device,” Journal of Urban Economics, vol. 36, no. 1, pp. 66–78, 1994.
[9] O. Hochman, D. Pines, and J.-F. Thisse, “On the optimality of local government: The effects of metropolitan spatial structure,” Journal of Economic Theory, vol. 65, no. 2, pp. 334–363, 1995.
[10] A. Esteller-Mor´e and A. Sol´e-Oll´e, “Vertical income tax externalities and fiscal interdependence: Evidence from the us,” Regional Science and Urban Economics, vol. 31, no. 2-3, pp. 247–272, 2001.
[11] M. Keen and C. Kotsogiannis, “Does federalism lead to excessively high taxes?” American Economic Review, vol. 92, no. 1, pp. 363–370, 2002.
[12] J.-i. Itaya, M. Okamura, and C. Yamaguchi, “Are regional asymmetries detrimental to tax coordination in a repeated game setting?” Journal of Public Economics, vol. 92, no. 12, pp. 2403–2411, 2008.
[13] L. P. Feld, G. Kirchg¨assner, and C. A. Schaltegger, “Decentralized taxation and the size of government: Evidence from swiss state and local governments,” Southern Economic Journal, vol. 77, no. 1, pp. 27–48, 2010.
[14] H. Ogawa and W. Wang, “Asymmetric tax competition and fiscal equalization in a repeated game setting,” International Tax and Public Finance, vol. 23, no. 6, pp. 1035–1064, 2016.

APPENDIX 1 - DERIVING REACTION FUNCTIONS

Let us get the partial derivatives that are needed to get the first order condition of social utility function. Starting with the easier ones,

\begin{align*}
&\frac{\partial K^*_S}{\partial \tau_S} = -\frac{l}{4}, \quad \frac{\partial K^*_L}{\partial \tau_L} = -\frac{l}{4}
, \quad \frac{\partial K^*_O}{\partial \tau_O} = -\frac{l}{3}\left(2-\alpha_L - \alpha_S\right).
\end{align*}

Partial differentiation of $r^*$ with respect to the tax rates are

\begin{align*}
\frac{\partial r^*}{\partial \tau_S} = -\frac{1}{2}, \quad \frac{\partial r^*}{\partial \tau_L} = -\frac{1}{2}, \quad \frac{\partial r^*}{\partial \tau_O} = -\frac{2- (\alpha_L + \alpha_S)}{2}.
\end{align*}

Then, the partial differentiation of $w^*_i$ with respect to respective tax rates are:

\begin{align*}
\frac{\partial w^*_S}{\partial \tau_S} &= 2 \left(\frac{K^*_S}{l}\right)\cdot \frac{\partial K_S^*/l}{\partial \tau_S} = -\frac{K^*_S}{2l} \\
\frac{\partial w^*_L}{\partial \tau_L} &= 2 \left(\frac{K^*_L}{l}\right)\cdot \frac{\partial K_L^*/l}{\partial \tau_L} = -\frac{K^*_L}{2l} \\
\frac{\partial w^*_O}{\partial \tau_O} &= 3 \left(\frac{K^*_O}{l}\right)\cdot \frac{\partial 3K_O^*/2l}{\partial \tau_O} = -\frac{3(2-\alpha_L - \alpha_S)K_O^*}{2l}.
\end{align*}

Furthermore,

\begin{align*}
&\frac{\partial K^*_S\tau_S}{\partial \tau_S} = K^*_S -\frac{l}{4}\tau_S, \quad \frac{\partial K^*_L\tau_L}{\partial \tau_L} = K^*_L -\frac{l}{4}\tau_L, \quad \frac{\partial K^*_O\tau_O}{\partial \tau_O} = K^*_O -\frac{l}{3}\left(2-\alpha_L - \alpha_S\right)\tau_O.
\end{align*}

Lastly, we have

\begin{align*}
\frac{\partial(1-\alpha_S)\tau_O K^*_S}{\partial \tau_S} &= - \frac{l}{4}(1-\alpha_S)\tau_O \\
\frac{\partial(1-\alpha_L)\tau_O K^*_L}{\partial \tau_L} &= - \frac{l}{4}(1-\alpha_L)\tau_O
\end{align*}

Summing up, the first order condition for the social utility functions of region $S$ and $L$ are:

\begin{align*}
\frac{\partial U_S}{\partial \tau_S} = l\left(-\frac{K^*_S}{2l} -\frac{\bar{k}_S}{2}\right) + K^*_S -\frac{l}{4}\tau_S - \frac{l}{4}(1-\alpha_S)\tau_O = 0 \\
\frac{\partial U_L}{\partial \tau_L} = l\left(-\frac{K^*_L}{2l} -\frac{\bar{k}_L}{2}\right) + K^*_L -\frac{l}{4}\tau_L - \frac{l}{4}(1-\alpha_L)\tau_O = 0
\end{align*}

Rearranging the terms, we see that

\begin{align*}
\tau_S &= -(1-\alpha_S)\tau_O + 2\left(k_S^* -\bar{k}_S\right) \\
&= -(1-\alpha_S)\tau_O + 2\bigg(\varepsilon + \frac{1}{4}\big((\tau_L - \tau_S - (\alpha_L - \alpha_S)\tau_O ) - (B_L - B_S)\big) \bigg) \\
&\iff \tau_S^* = \frac{4\varepsilon}{3} - \frac{\theta}{3} + \frac{\tau_L}{3} - \frac{2-3\alpha_S + \alpha_L}{3}\tau_O \\
\tau_L &= -(1-\alpha_L)\tau_O + 2\left(k_L^* -\bar{k}_L\right) \\
&= -(1-\alpha_L)\tau_O + 2\bigg(-\varepsilon + \frac{1}{4}\big((\tau_S - \tau_L + (\alpha_L - \alpha_S)\tau_O ) + (B_L - B_S)\big) \bigg) \\
&\iff \tau_L^* = -\frac{4\varepsilon}{3} + \frac{\theta}{3} + \frac{\tau_S}{3} - \frac{2-3\alpha_L + \alpha_S}{3}\tau_O.
\end{align*}

The FOC for the social utility function of region $O$ is:

\begin{align*}
&\frac{2l}{3}\left(-\frac{3(2-\alpha_L - \alpha_S)K_O^*}{2l} - \frac{2- (\alpha_L + \alpha_S)}{2}\bar{k}\right) + K^*_O -\frac{l}{3}\left(2-\alpha_L - \alpha_S\right)\tau_O = 0 \\
& \iff \tau_O^* = \frac{3(\alpha_L + \alpha_S)-4}{(2-\alpha_L - \alpha_S)(\alpha_L + \alpha_S)}\bar{k} = \Gamma \bar{k}%= \frac{2-3\alpha}{\alpha(2-\alpha)}\bar{k}
\end{align*}

APPENDIX 2 - DERIVING NASH EQUILIBRIUM

Let $\gamma_S$ and $\gamma_L$ be $\Gamma \cdot (2-3\alpha_S + \alpha_L)/3$ and $\Gamma \cdot (2-3\alpha_L + \alpha_S)/3$, respectively. Then,

\begin{align*}
\tau_S &= \frac{4\varepsilon}{3} - \frac{\theta}{3} + \frac{1}{3}\left(-\frac{4\varepsilon}{3} + \frac{\theta}{3} + \frac{\tau_S}{3} - \gamma_L\bar{k}\right) - \gamma_S\bar{k} \\
&= \frac{8\varepsilon}{9} - \frac{2\theta}{9} + \frac{1}{9}\tau_S - \left(\frac{\gamma_L}{3} + \gamma_S\right)\bar{k}\\
&\iff \tau_S^N = \varepsilon - \frac{\theta}{4} - \Gamma\left(1-\alpha_S \right)\bar{k}\\
\tau_L^N &= -\left(\varepsilon-\frac{\theta}{4}\right) - \Gamma\left(1-\alpha_L\right)\bar{k}\\
\tau_O^N &= \Gamma\bar{k}
\end{align*}

It follows that

\begin{align*}
\tau_L^N - \tau_S^N &= -2\left(\varepsilon - \frac{\theta}{4}\right) + \Gamma(\alpha_L - \alpha_S)\bar{k}\\
\tau_S^N - \tau_L^N &= 2\left(\varepsilon - \frac{\theta}{4}\right) + \Gamma(\alpha_S - \alpha_L)\bar{k}
\end{align*}

Plugging them in, the interest rates and capital demanded in each region are:

\begin{align*}
r^N &= \frac{1}{2}\left(B_S + B_L\right) - 2\bar{k}\\
K_S^N &= l\left(\bar{k} - \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)\right) = l\left(\bar{k}_S + \frac{1}{2}\left(\varepsilon - \frac{\theta}{4} \right)\right)\\
K_L^N &= l\left(\bar{k} + \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right)\right) = l\left(\bar{k}_L - \frac{1}{2}\left(\varepsilon - \frac{\theta}{4} \right)\right)\\
K_{SS}^N &= \frac{2l}{3}\left(\bar{k} + \frac{1}{2}\bar{k}\Gamma(1-\alpha_S) - \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right) \right)\\
K_{SL}^N &= \frac{2l}{3}\left(\bar{k} + \frac{1}{2}\bar{k}\Gamma(1-\alpha_L) + \frac{1}{2}\left(\varepsilon + \frac{\theta}{4} \right) \right)\\
K_O^N &= \frac{l}{3}\cdot\frac{4 -\alpha_L - \alpha_S}{\alpha_L + \alpha_S}\bar{k}
\end{align*}

How is Korea’s Blood Supply Maintained? - Effects of the COVID-19 Pandemic, Blood Shortage Periods, and Promotions on Blood Supply Dynamics

Published

August 5, 2024

Donggyu Kim*

* Swiss Institute of Artificial Intelligence, Chaltenbodenstrasse 26, 8834 Schindellegi, Schwyz, Switzerland

Abstract

This study quantitatively assesses the effects of the COVID-19 pandemic, blood shortage periods, and promotional activities on blood supply and usage in Korea. Multiple linear regression analysis was conducted using daily blood supply, usage, and stock data from 2018 to 2023, incorporating various control variables. Findings revealed that blood supply decreased by 5.11% and blood usage decreased by 4.25% during the pandemic. During blood shortage periods, blood supply increased by 3.96%, while blood usage decreased by 1.98%. Although the signs of the estimated coefficients aligned with those of a previous study [1], their magnitudes differed. Promotional activities had positive effects on blood donation across all groups, but the magnitude of the impact varied by region and gender. Special promotions offering sports viewing tickets were particularly effective. This study illustrates the necessity of controlling exogenous variables to accurately measure their effects on blood supply and usage, which are influenced by various social factors. The findings underscore the importance of systematic promotion planning and suggest the need for tailored strategies across different regions and demographic groups to maintain a stable blood supply.

PDF View PDF Download

1. Introduction

Blood transfusion is an essential treatment method used in emergencies, for certain diseases, and during surgeries. Blood required for transfusion cannot be substituted with other substances and has a short shelf life. Additionally, it experiences demand surges that are difficult to predict. Therefore, systematic management of blood stock is required. Blood stock is determined by blood usage and the number of blood donors. Thus, understanding the relationship between usage and supply, as well as the effect of promotions, is essential for effective blood stock management.

There is a paucity of quantitative research on blood supply dynamics during crises and the effects of promotional activities. This study aims to address this research gap. Previous studies in Korea on blood donation have primarily focused on qualitative analysis to identify motives for blood donation through surveys.[2][3][4][5] Kim (2015)[6] used multiple linear regression analysis to predict the number of donations for individual donors but used the personal information of each donor as explanatory variables and did not consider time series characteristics. For this reason, understanding the dynamics of the total number of donors was difficult. Kim (2023)[1] studied the impact of the COVID-19 epidemic on the number of blood donations but did not control exogenous variables and types of blood donation.

This study aims to quantify the effects of the COVID-19 pandemic and blood shortage periods on blood supply and usage. Additionally, it measures the quantitative effects of various promotions on the number of blood donors. To achieve these objectives, regression analysis was utilized with control variables, enabling more precise estimations than previous studies. Based on these findings, this paper proposes effective blood management methods.

2. Methodology

2.1. Research Subjects

According to the Blood Management Act[7], blood can be collected at medical institutions, the Korean Red Cross Blood Services, and the Hanmaeum Blood Center. According to the Annual Statistical Report[8], blood donations conducted by the Korean Red Cross accounted for 92% of all blood donations in 2022. This study uses blood donation data from the Korean Red Cross Blood Services, which accounts for the majority of the blood supply.

2.2. Data Sources

The data for the number of blood donors by location utilized in this study were obtained from the Annual Statistical Report on Blood Services.[8] The daily data on the number of blood donors, blood usage, blood stock, and promotion dates were provided by the Korean Red Cross Blood Services[9].

The data for the number of blood donors, blood usage, and blood stock used in the study cover the period from January 1, 2018, to July 31, 2023, and the promotion date data cover the period from January 1, 2021, to July 31, 2023. Temperature and precipitation data were obtained from the Korea Meteorological Administration’s Automated Synoptic Observing System (ASOS) [10].

2.3. Variable Definitions

In this study, the number of blood donors, or blood supply, is defined as the number of whole blood donations at the Korean Red Cross. The COVID-19 pandemic period is defined as the duration from January 20, 2020, (the first case in Korea) to March 1, 2022 (the end of the vaccine pass operation). Red blood cell product stock is defined as the sum of concentrated red blood cells, washed red blood cells, leukocyte-reduced red blood cells, and leukocyte-filtered red blood cell stock.

Blood usage is based on the quantity of red blood cell products supplied by the Korean Red Cross to medical institutions. The regions in the study are divided according to the jurisdictions of the blood centers under the Korean Red Cross Blood Service and do not necessarily coincide with Korea’s administrative districts. Data from the Seoul Central, Seoul Southern, and Seoul Eastern Blood Centers were integrated and used as Seoul.

Weather information for each region is based on measurements from the nearest weather observation station to the blood center, and the corresponding Observation point numbers for each region are listed in Table 1.
Public holidays are based on the Special Day Information[11] from the Korea Astronomy and Space Science Institute.

\begin{array}{l|r}
\hline
\textbf{Blood center name} & \textbf{Observation station number} \\
\hline
\text{Seoul} & 108 \\
\text{Busan} & 159 \\
\text{Daegu/Gyeongbuk} & 143 \\
\text{Incheon} & 112 \\
\text{Ulsan} & 152 \\
\text{Gyeonggi} & 119 \\
\text{Gangwon} & 101 \\
\text{Chungbuk} & 131 \\
\text{Daejeon/Sejong/Chungnam} & 133 \\
\text{Jeonbuk} & 146 \\
\text{Gwangju/Jeonnam} & 156 \\
\text{Gyeongnam} & 255 \\
\text{Jeju} & 184 \\
\hline
\end{array}

Table 1. Observation Station number for each Blood Center

2.4. Variable Composition

Dependent Variable Plasma donations, 67% of which are used for pharmaceutical raw materials[8], can be imported due to their long shelf life of 1 year.[7] Also, in the case of platelet and multi-component blood donation, 95% of donors are male[8], and a decent number of days have no female blood donors, affecting the analysis. For these reasons, this study used the number of whole-blood donors as the target variable. Additionally, the amount of collected blood is determined by an individual’s physical condition[7], not by preference. Therefore, it is integrated within gender groups.

Explanatory Variables Considering the differences in operating hours of blood donation centers on weekdays, weekends, and holidays as shown in Figure 1, variables indicating the day of week and holiday were considered.
Table 2 shows that there are differences between regions on blood donation. Considering this, region was used as a control variable.

Figure 1. Distribution of number of blood donors by day of the week and holiday

The annual seasonal effects are not controlled by the holidays variable alone. for this reason, Fourier terms were introduced as explanatory variables [12].

\begin{array}{l|r|r|r}
\hline
\textbf{Blood center name}& \textbf{Population} & \textbf{Blood donation} & \textbf{Blood donation ratio (%)} \\
\hline
\text{Total} & 51,439,038 & 2,649,007 & 5.1 \\
\hline
\text{Seoul} & 9,428,372 & 846,646 & 9.0 \\
\text{Busan} & 3,317,812 & 204,250 & 6.2 \\
\text{Daegu/Gyeongbuk} & 4,964,183 & 225,245 & 4.5 \\
\text{Incheon} & 2,967,314 & 170,777 & 5.8 \\
\text{Ulsan} & 13,589,432 & 217,008 & 1.6 \\
\text{Gyeonggi} & 1,536,498 & 124,866 & 8.1 \\
\text{Gangwon} & 1,595,058 & 83,820 & 5.3 \\
\text{Chungbuk} & 3,962,700 & 237,169 & 6.0 \\
\text{Daejeon/Sejong/Chungnam} & 1,769,607 & 96,992 & 5.5 \\
\text{Jeonbuk} & 3,248,747 & 189,559 & 5.8 \\
\text{Gwangju/Jeonnam} & 3,280,493 & 123,250 & 3.8 \\
\text{Gyeongnam} & 1,110,663 & 87,677 & 7.9 \\
\text{Jeju} & 678,159 & 41,748 & 6.2 \\
\hline
\end{array}

Table 2. Blood donation rate by region[8]

$$X_{sin_{ij}} = \sin \left( \frac{2\pi \operatorname{doy}(bd)_{i}}{365}j \right),\quad X_{cos_{ij}} = \cos\left( \frac{2\pi \operatorname{doy}(bd)_{i}}{365}j \right)$$

Where $\operatorname{doy}\text{(01-01-yyyy)} = 0,\dots, \operatorname{doy}\text{(12-31-yyyy)} = 364$

$j$ is a hyperparameter setting the number of Fourier terms added. $j = 1,\dots,6$ terms with optimal AIC[13] were added to the model.

According to Table 2, 70% of all blood donors visit blood donation centers to donate. Therefore, to control the influence of weather conditions[14] that affect the pedestrian volume and the number of blood donors, a precipitation variable was included. Meanwhile, the temperature variable, which has a strong relationship with the season, was already controlled by Fourier terms and was found to be insignificant, so it was excluded from the explanatory variables.

\begin{array}{l|r}
\hline
\textbf{Blood donation place} & \textbf{Ratio} \\
\hline
\text{Individual, blood donation center} & 74.8\% \\
\text{Individual, street} & 0.3\% \\
\text{Group} & 24.9\% \\
\hline
\end{array}

Table 3. Ratio of blood donation by place

The Public-Private Joint Blood Supply Crisis Response Manual[15] specifies blood usage control measures during crises (Table 4). To reflect the effect of crises in the model, a blood shortage day variable was created and introduced as a proxy for the crisis stage.

A blood shortage day is defined as a day when the daily blood stock is three times less than the average daily usage in the previous year, as well as the following 7 days. The mathematical expression is as follows:

\begin{array}{l|r}
\hline
\textbf{Category} & \textbf{Criteria} \\
\hline
\text{Interest} & (bs) < 5 \times (bu)_{prev\, year} \\
\text{Cautious} & (bs) < 3 \times (bu)_{prev\, year} \\
\text{Alert} & (bs) < 2 \times (bu)_{prev\, year} \\
\text{Serious} & (bs) < 1 \times (bu)_{prev\, year} \\
\hline
\end{array}

Table 4. Criteria for blood supply crisis stage[15]

Where $(bs)$ is the blood stock and $(bu)$ is the blood usage

$$(bu)_{prev\, year}[(bu)_{i}] := \frac{1}{365}\sum_{\{j|\operatorname{year}[(bu)_{j}]= \operatorname{year}[(bu)_{i}]-1\}}(bu)_{j}$$

$$C_{i} := \mathbf{1}\left[ (bs)_{i} < 3 \times (bu)_{prev\, year}[(bu)_{i}] \right]$$

$$D_{short, i}:= \operatorname{sgn}[\sum\limits_{j=i-6}^{i} C_{j}]$$

According to Figure 2, population movement decreased during the COVID-19 pandemic, and blood donation was not allowed for a certain period after COVID recovery or vaccination.[7] Bae (2021)[16] showed factors that affected the decrease in blood donors during this period. Kim (2023)[1] also showed that there was a decrease in blood donation during this period. To control the effect of the pandemic, COVID-19 pandemic period dummy variables were introduced.

2.5. Research Method

To analyze the effects of promotions according to region and gender, this study conducted multiple regression analyses. The OLS model was chosen because it clearly shows the linear relationship between variables, and the results are highly interpretable.

Control variables were used to derive accurate estimates. The model’s accuracy was confirmed through the estimated effect of the COVID-19 and blood shortage period variables, consistent with the prior studies.

\begin{array}{l|r}
\hline
\textbf{Variable Name} & \textbf{Description} \\
\hline
D_{dow_{ij}} = \mathbf{1}[\operatorname{dow}(bd_{i}) = j] & \text{Day of week dummies} \\
\text{where} \operatorname{dow}\text{(monday)}=0, \dots, \operatorname{dow}\text{(sunday)}=6 & \\
\hline
D_{hol} & \text{Holiday dummy} \\
\hline
D_{short} & \text{Shortage day dummy} \\
\hline
D_{cov\, week} & \text{COVID pandemic weekday dummy} \\
\hline
D_{cov\, sat} & \text{COVID pandemic Saturday dummy} \\
\hline
D_{cov\, sun} & \text{COVID pandemic Sunday dummy} \\
\hline
X_{sin_{ij}} = \sin\left(\frac{2\pi \operatorname{doy}(bd_{i})}{365}\right) \\
X_{cos_{ij}} = \cos\left(\frac{2\pi \operatorname{doy}(bd_{i})}{365}\right) & \text{Fourier terms for yearly seasonality} \\
\text{where} \operatorname{doy}\text{(01-01-yyyy)} = 0, \dots, \operatorname{doy}\text{(12-31-yyyy)} = 364 & \\
\hline
X_{rain\, fall, r} & \text{Precipitation in region r} \\
\hline
D_{promo} & \text{Promotion dummy} \\
\hline
D_{special\, promo} & \text{Special promotion dummy} \\
\hline
\end{array}

Table 5. Variable Description

The proposed model for the number of blood donors is expressed as an OLS model, as shown in Equation (1). The model for blood usage is also defined using the same explanatory variables.

\begin{equation}\begin{aligned}
bd_{i} &= \sum\limits_{j=0}^{6}\beta_{dow_{j}}D{dow_{ij}} + \beta_{h}D_{hol} + \beta_{s}D_{short}\\
&+ \beta_{cw}D_{cov\, week_{i}} + \beta_{c\, sat}D_{cov\, sat_{i}} + \beta_{c\, sun}D_{cov\, sun_{i}}\\
&+ \sum\limits_{j=1}^{7}(\beta_{cos_{j}}X_{cos_{ij}} + \beta_{sin_{j}}X_{sin_{ij}}) + D_{promo} + D_{special\, promo}
\end{aligned}\tag{1}\label{eq1}
\end{equation}

The model considering regional characteristics is shown in Equation \eqref{eqn:bd_region}, where $r$ represents each region.

\begin{equation}\begin{aligned}\label{eqn:bd_region}
bd_{i,r} &= \sum\limits_{j=0}^{6}\beta_{dow_{j, r}}D_{dow_{ij}} + \beta{h, r}D_{hol} + \beta_{s, r}D_{short}\\
&+ \beta_{cw, r}D_{cov\, week_{i}} + \beta{c\, sat, r}D_{cov\, sat_{i}} + \beta{c\, sun, r}D_{cov\, sun_{i}}\\
&+ \sum\limits_{j=1}^{7}(\beta_{\cos_{j}, r}X_{\cos_{ij}} + \beta_{\sin_{j}, r}X_{\sin_{ij}}) + X_{rain\, fall, r} + D_{promo} + D_{special\, promo}
\end{aligned}\tag{2}\label{eq2}
\end{equation}

3. Result and Discussion

The regression results presented in Tables 6 and 7 reveal patterns in blood supply and usage dynamics. The day of the week significantly influences both supply and demand, also holidays have a substantial negative impact. The high R-squared value (0.902) of the blood usage model suggests that it accounts for most of the variability in blood usage.

\begin{array}{lcccc}
\hline
& & \textbf{Blood Supply Model Summary} & & \\
\hline
\text{R-squared} & 0.657 & \text{Adj. R-squared} & 0.653 \\
\hline
& coef & std err & t-value & \texttt{P>|t|} \\
mon & 5818.4625 & 52.319 & 111.211 & 0.000 \\
tue & 5660.4049 & 52.225 & 108.385 & 0.000 \\
wed & 5776.1600 & 52.270 & 110.507 & 0.000 \\
thu & 5704.9131 & 52.033 & 109.641 & 0.000 \\
fri & 6587.9064 & 52.282 & 126.008 & 0.000 \\
sat & 5072.4046 & 62.301 & 81.417 & 0.000 \\
sun & 3211.1523 & 62.386 & 51.472 & 0.000 \\
holiday & -3116.7659 & 92.486 & -33.700 & 0.000 \\
shortage & 214.1011 & 85.250 & 2.511 & 0.012 \\
cov\_weekday& -482.9943 & 45.439 & -10.629 & 0.000 \\
cov\_sat & 280.2871 & 101.498 & 2.762 & 0.006 \\
cov\_sun & 128.8915 & 100.873 & 1.278 & 0.201 \\
sin\_1 & -2.1102 & 26.913 & -0.078 & 0.938 \\
cos\_1 & 34.2147 & 26.223 & 1.305 & 0.192 \\
sin_2 & -175.5033 & 26.685 & -6.577 & 0.000 \\
cos\_2 & 0.0618 & 26.736 & 0.002 & 0.998 \\
sin\_3 & 94.0805 & 26.947 & 3.491 & 0.000 \\
cos\_3 & 59.8794 & 26.371 & 2.271 & 0.023 \\
sin\_4 & -37.4031 & 26.410 & -1.416 & 0.157 \\
cos\_4 & -158.1612 & 26.401 & -5.991 & 0.000 \\
sin\_5 & -84.8483 & 26.622 & -3.187 & 0.001 \\
cos\_5 & -2.8368 & 26.699 & -0.106 & 0.915 \\
sin\_6 & 102.8305 & 26.498 & 3.881 & 0.000 \\
cos\_6 & -53.5580 & 26.240 & -2.041 & 0.041 \\
sin\_7 & -90.8545 & 26.299 & -3.455 & 0.001 \\
cos\_7 & 3.4063 & 26.318 & 0.129 & 0.897 \\
\hline
\end{array}

Table 6. Summary of blood supply model

While the relatively lower R-squared value (0.657) of the blood supply model indicates that it is influenced by various random social effects.

3.1. Impact of the COVID-19 Pandemic

Table 7 shows that blood usage decreased by 4.25% during the COVID-19 pandemic period. This includes not only the man-made decrease in the supply due to the blood shortage but also the impact on the demand, where surgeries were reduced due to COVID- 19. Table 6 shows that blood supply also decreased by 5.11% during the same period.

3.2. Impact of Blood Shortage Periods

During blood shortage periods, blood usage decreased by 1.98% (Table 7), while blood supply increased by 3.96% (Table 6) due to conservation efforts and increased donations. This result reflects the efforts made by medical institutions to adjust blood usage and the impact of blood donation promotion campaigns in response to blood shortage situations.

\begin{array}{lcccc}
\hline
& & \textbf{Blood Usage Model Summary} & & \\
\hline
\text{R-squared} & 0.902 & \text{Adj. R-squared} & 0.901 \\
\hline
& coef & std err & t-value & \texttt{P>|t|} \\
mon & 6580.0769 & 33.335 & 197.390 & 0.000 \\
tue & 6326.6840 & 33.276 & 190.130 & 0.000 \\
wed & 6020.3690 & 33.304 & 180.771 & 0.000 \\
thu & 6017.6003 & 33.153 & 181.509 & 0.000 \\
fri & 6064.1381 & 33.312 & 182.043 & 0.000 \\
sat & 3293.3294 & 39.696 & 82.964 & 0.000 \\
sun & 2329.8185 & 39.750 & 58.612 & 0.000 \\
holiday & -2594.7261 & 58.928 & -44.032 & 0.000 \\
shortage & -103.5165 & 54.318 & -1.906 & 0.057 \\
cov\_weekday& -316.7550 & 28.952 & -10.941 & 0.000 \\
cov\_sat & -2.1619 & 64.670 & -0.033 & 0.973 \\
cov\_sun & 29.0202 & 64.272 & 0.452 & 0.652 \\
sin\_1 & -53.2172 & 17.148 & -3.103 & 0.002 \\
cos\_1 & 59.3688 & 16.708 & 3.553 & 0.000 \\
sin\_2 & -86.3912 & 17.003 & -5.081 & 0.000 \\
cos\_2 & 20.3897 & 17.035 & 1.197 & 0.231 \\
sin\_3 & 79.5388 & 17.170 & 4.633 & 0.000 \\
cos\_3 & 40.4587 & 16.802 & 2.408 & 0.016 \\
sin\_4 & -13.3373 & 16.827 & -0.793 & 0.428 \\
cos\_4 & -10.7505 & 16.822 & -0.639 & 0.523 \\
sin\_5 & -24.8776 & 16.962 & -1.467 & 0.143 \\
cos\_5 & 12.4523 & 17.012 & 0.732 & 0.464 \\
sin\_6 & 1.0620 & 16.883 & 0.063 & 0.950 \\
cos\_6 & -12.7196 & 16.719 & -0.761 & 0.447 \\
sin\_7 & -4.3845 & 16.756 & -0.262 & 0.794 \\
cos\_7 & 19.6560 & 16.769 & 1.172 & 0.241 \\
\hline
\end{array}

Table 7. Summary of blood usage model

The signs of these estimates are consistent with a previous study [1], providing evidence that the model is well-identified.

3.3. Effects of Promotions

The Korean Red Cross employs promotional methods such as additional giveaways and sending blood donation request messages to address blood shortages. Among these methods, the additional giveaway promotion was conducted uniformly across all regions over a long period, rather than as a one-time event. For this reason, the effect of this promotion was primarily analyzed. Park (2018)[18] showed the impact of promotions on blood donors, but it was based on a survey and had limitations in that quantitative changes could not be measured.

The effect of the additional giveaway promotion on the number of blood donors was confirmed by using promotion days as a dummy variable while controlling the exogenous factors considered earlier. To control the trend effect that could occur due to the clustering of promotion days (Figure 3), the entire period was divided into quarters, and the effect of promotions was measured within each quarter. To prevent outliers that could occur due to data imbalance within the quarter, only periods where the ratio of promotion days within the period ranged from 10% to 90% were used for the analysis. The Seoul, Gyeonggi, and Incheon regions were excluded from the analysis as the promotions were always carried out, making comparison impossible. Due to the nature of the dependent variable being affected by various social factors, the effect of promotions showed variance, but the mean of the distribution was estimated to be positive for all groups (Figure 4, Table 8).

In addition to the additional giveaway promotion, the Korean Red Cross conducts various special promotions. Special promotions include all promotions other than the additional giveaway promotion. As special promotions are conducted for short periods, trend effects cannot be eliminated by simply using dummy variables. Therefore, the net effect on the number of blood donors during special promotion periods was measured by comparing the number of donors during the promotion period with two weeks before and

Figure 4. Gender and region wise promotion effect

\begin{array}{llr}
\hline
\textbf{Region} & \textbf{Gender} & \textbf{Average Promotion Effect} \\
\hline
\text{Gangwon}& \text{Male} & 20.2\% \\
& \text{Female} & 16.3\% \\
\hline
\text{Gyeongnam}& \text{Male} & 14.4\% \\
& \text{Female} & 4.1\% \\
\hline
\text{Gwangju/Jeonnam}& \text{Male} & 9.6\% \\
& \text{Female} & 14.4\% \\
\hline
\text{Daegu/Gyeongbuk}& \text{Male} & 18.5\% \\
& \text{Female} & 16.8\% \\
\hline
\text{Daejeon/Sejong/Chungnam}& \text{Male} & 17.7\% \\
& \text{Female} & 12.9\% \\
\hline
\text{Busan}& \text{Male} & 19.0\% \\
& \text{Female} & 14.1\% \\
\hline
\text{Ulsan}& \text{Male} & 5.3\% \\
& \text{Female} & 14.1\% \\
\hline
\text{Jeonbuk}& \text{Male} & 9.9\% \\
& \text{Female} & 12.2\% \\
\hline
\text{Jeju}& \text{Male} & 5.5\% \\
& \text{Female} & 8.7\% \\
\hline
\text{Chungbuk}& \text{Male} & 11.0\% \\
& \text{Female} & 19.2\% \\
\hline
\end{array}

Table 8. Gender and region-wise promotion effect

after the promotion. Among various special promotions, the sports viewing ticket give- away promotion showed a high performance in several regions: Gangwon (basketball), Gwangju/Jeonnam (baseball), Ulsan (baseball), and Jeju (soccer) Table 9.

\begin{array}{l|r|r}
\hline
\textbf{Region} & \textbf{Rank of the sports promotions} & \textbf{The number of special promotions} \\
\hline
\text{Gangwon} & 1st & 4\\
\text{Gyeongnam} & N/A & 9\\
\text{Gwangju/Jeonnam} & 1st, 3rd, 4th & 6\\
\text{Daegu/Gyeongbuk} & N/A & 7\\
\text{Daejeon/Sejong/} \\ \, \text{Chungnam} & N/A & 11\\
\text{Busan} & N/A & 6\\
\text{Ulsan} & 2nd & 4\\
\text{Jeonbuk} & N/A & 6\\
\text{Jeju} & 2nd & 7\\
\text{Chungbuk} & N/A & 7\\
\hline
\end{array}

Table 9. Performance of sports viewing ticket giveaway promotions by region

4. Conclusion

Previous studies had the limitation of not being able to quantitatively measure changes in blood stock during the COVID-19 pandemic and blood shortage situations. Furthermore, the effect of blood donation promotions on the number of blood donors in Korea has not been studied.

This study quantitatively analyzed changes in supply and usage during the pandemic and blood shortage situations, as well as the impact of various promotions, using exogenous variables, including time series variables, as control variables. According to the findings of this study, a stable blood supply can be achieved by improving the low promotion response in the Ulsan-si and Jeju-do regions and implementing sports viewing ticket giveaway promotions nationwide.

This study was conducted using short-term regional grouped data due to constraints in data collection. Given that blood donation centers within a region are not homogeneous and the characteristics of blood donors change over time, future research utilizing long-term individual blood donation center data and promotion data could significantly enhance the rigor and granularity of the analysis.

References

[1] Eunhee Kim. Impact of the covid-19 pandemic on blood donation. 2023.

[2] JunSeok Yang. The relationship between attitude on blood donation and altruism of blood donors in gwangju-jeonnam area. 2019.

[3] Jihye Yang. The factor of undergraduate student’s blood donation. 2013.

[4] Eui Yeong Shin. A study on the motivations of the committed blood donors. 2021.

[5] Dong Han Lee. Segmenting blood donors by motivations and strategies for retaining the donors in each segment. 2013.

[6] Shin Kim. A study on prediction of blood donation number using multiple linear regression analysis. 2015.

[7] Republic of Korea. Blood management act, 2023.

[8] Korean Red Cross Blood Services. 2022 annual statistical report on blood services, 2022.

[9] Korean Red Cross Blood Services. Daily data for the number of blood donors, blood usage, blood stock, and promotion dates, 2023.

[10] KoreaMeteorologicalAdministration.Automatedsynopticobservingsystem,2023.

[11] Korea Astronomy and Space Science Institute. Special day information, 2023.

[12] Peter C Young, Diego J Pedregal, and Wlodek Tych. Dynamic harmonic regression. Journal of forecasting, 18:369–394, 1999.

[13] H Akaike. Information theory as an extension of the maximum likelihood principle. A ́ in: Petrov, bn and csaki, f. In Second International Symposium on Information Theory. Akademiai Kiado, Budapest, pp. 276A ́281, 1973.

[14] Su mi Lee and Sungjo Hong. The effect of weather and season on pedestrian volume in urban space. Journal of the Korea Academia-Industrial cooperation Society, 20:56–65, 2019.

[15] Republic of Korea. Framework act on the management of disasters and safety, 2023.

[16] Hye Jin Bae, Byong Sun Ahn, Mi Ae Youn, and Don Young Park. Survey on blood donation recognition and korean red cross’ response during covid-19 pandemic. The Korean Journal of Blood Transfusion, 32:191–200, 2021.

[17] Statistics Korea. Sk telecom mobile data, 2020.

[18] Seongmin Park. Effects of blood donation events on the donors’ intentions of visit in ulsan. 2018.

Data Scientific Intuition that defines Good vs. Bad scientists

Picture

Member for

9 months 1 week

Real name

Keith Lee

Bio

Professor of AI/Data Science @SIAI
Senior Research Fellow @GIAI Council
Head of GIAI Asia

Published

August 5, 2024

Modified

June 9, 2025

Many amateur data scientists have little respect to math/stat behind all computational models
Math/stat contains the modelers' logic and intuition to real world data
Good data scientists are ones with excellent intuition

On SIAI's website, we can see most wannabe students go to MSc AI/Data Science program intro page and almost never visit MBA AI program pages. We have a shorter track for MSc that requires extensive pre-study, and much longer version that covers missing pre-studies. Over 90% of wannabes just take a quick scan on the shorter version and walk away. Less than 10% to the longer version, and almost nobody to the AI MBA.

We get that they are 'wannabe' data scientists with passion, motivation, and dream with self-confidence that they are the top 1%. But the reality is harsh. So far, less than 5% applicants have been able to pass the admission exam to MSc AI/Data Science's longer version. Almost never we have applicants who are ready to do the shorter one. Most, in fact, almost all students should compromise their dream and accept the reality. The fact that the admision exam is the first two courses of the AI MBA, lowest tier program, already bring students to senses that over a half of applicants usually disappear before and after the exam. Some students choose to retake the exam in the following year, but mostly end up with the same score. Then, they either criticize the school in very creative ways or walk away with frustrated faces. I am sorry for keeping such high integrity of the school.

Data Scientific Intuition that matters the most

The school focuses on two things in its education. First, we want students to understand the thought processes of data science modelers. Support Vector Machine (SVM), for example, reflects the idea that fitting can be more generalized if a separating hyperplane is bounded with inequalities, instead of fixed conditions. If one can understand that the hyperplane itself is already a generalization, it can be much easier to see through why SVM was introduced as an alternative to linear form fitting and what are the applicable cases in real life data science exercises. The very nature of this process is embedded in the school's motto, 'Rerum Cognoscere Causas' ((Felix, qui potuit rerum cognoscere causas - Wikipedia)), meaning a person pursuing the fundamental causes.

The second focus of the school is to help students where and how to apply data science tools to solve real life puzzles. We call this process as the building data scientific instuition. Often, math equations in the textbooks and code lines in one's program console screens do not have any meaning, unless it is combined in a way to solve a particular problem in a peculiar context with a specific object. Unlike many amateur data scientists' belief, coding libraries have not democratized data science to untrained students. In fact, the codes copied by the amateurs are evident examples of rookie failures that data science tools need must deeper background knowledge in statistics than simple code libraries.

Our admission exam is designed to weed out the dreamers or amateurs. After years of trials and errors, we have decided to give a full lecture of elementary math/stat course to all applicants so that we can not only offer them a fair chance but also give them a warning as realistic as our coursework. Previous schooling from other schools may help them, but the exam help us to see if one has potential to develop 'Rerum Cognoscere Causas' and data scientific intuition.

Intution does not come from hard study alone

When I first raised my voice for the importance of data scientific intution, I had had severe conflicts with amateur engineers. They thought copying one's code lines from a class (or a github page) and applying it to other places will make them as good as high paid data scientists. They thought these are nothing more than programming for websites, apps, and/or any other basic programming exercises. These amateurs never understand why you need to do 2nd-stage-least-square (2SLS) regression to remove measurement error effects for a particular data set in a specific time range, just as an example. They just load data from SQL server, add it to code library, and change input variables, time ranges, and computer resources, hoping that one combination out of many can help them to find what their bosses want (or what they can claim they did something cool). Without understanding the nature of data process, which we call 'data generating process' (DGP), their trials and errors are nothing more than higher correlation hunting like untrained sociologists do in their junk researches.

Instead of blaming one code library worse performing than other ones, true data scientists look for embedded DGP and try to build a model following intuitive logic. Every step of the model requires concreate arguments reflecting how the data was constructed and sometimes require data cleaning by variable re-structuring, carving out endogeneity with 2SLS, and/or countless model revisions.

It has been witnessed by years of education that we can help students to memorize all the necessary steps for each textbook case, but not that many students were able to extend the understanding to ones own research. In fact, the potential is well visible in the admission exam or in the early stage of the coursework. Promising students always ask why and what if. Why SVM's functional shape has $1/C$ which may limit the range of $C$ in his/her model, and what if his/her data sets with zero truncation ends up with close to 0 separating hyperplane? Once the student can see how to match equations with real cases, they can upgrade imaginative thought processes to model building logic. For other students, I am sorry but I cannot recall successful students without that ability. High grades in simple memory tests can convince us that they study hard, but lack of intuition make them no better than a textbook. With the experience, we design all our exams to measure how intuitive students are.

Intuition that frees a data scientist

In my Machine Learning class for tree models, I always emphasize that a variable with multiple disconnected effective ranges in trees has a different spanned space from linear/non-linear regressions. One variable that is important in a tree space, for example, may not display strong tendency in linear vector spaces. A drug that is only effective to certain age/gender groups (say 5~15, 60~ for male, 20~45 female) can be a good example. Linear regression hardly will capture the same efffective range. After the class, most students understand that relying on Variable Importances of tree models may conflict with p-value type variable selections in regression-based models. But only students with intuition find a way to combine both models that they find the effective range of variables from the tree and redesign the regression model with 0/1 signal variables to separate the effective range.

The extend of these types of thought process is hardly visible from ordinary and disqualified students. Ordinary ones may have capacity to discern what is good, but they often have hard time to apply new findings to one's own. Disqualified students do not even see why that was a neat trick to the better exploitation of DGP.

What's surprising is that previous math/stat education mattered the least. It was more about how logical they are, how hard-working they are, and how intuitive they are. Many students come with the first two, but hardly the third. We help them to build the third muscle, while strenghtening the first. (No one but you can help the second.)

The re-trying students ending up with the same grades in the admission exam are largely because they fail to embody the intuition. It may take years to develop the third muscle. Some students are smart enough to see the value of intuition almost right away. Others may never find that. For failing students, as much as we feel sorry for them, we think that their undergraduate education did not help them to build the muscle, and they were unable to build it by themselves.

The less chanllenging tier programs are designed in a way to help the unlucky ones, if they want to make up the missing pieces from their undergraduate coursework. Blue pills only make you live in fake reality. We just hope our red pill to help you find the bitter but rewarding reality.

Picture

Member for

9 months 1 week

Real name

Keith Lee

Bio

Professor of AI/Data Science @SIAI
Senior Research Fellow @GIAI Council
Head of GIAI Asia

AI Pessimism, just another correction of exorbitant optimism

Picture

Member for

8 months 1 week

Real name

Ethan McGowan

Bio

Founding member of GIAI & SIAI
Professor of Data Science @ GSB

Published

August 1, 2024

Modified

June 9, 2025

AI talks turned the table and become more pessimistic
It is just another correction of exorbitant optimism and realisation of AI's current capabilities
AI can only help us to replace jobs in low noise data
Jobs needing to find new patterns and from high noise data industry, mostly paid more, will not be replaceable by current AI

There have been pessimistic talks about the future of AI recently that have created sudden drops in BigTech firms' stock prices. In all of a sudden, all pessimistic talks from Investors, experts, and academics in reputed institutions are re-visited and re-evaluated. They claim that ROI (Return on Investment) for AI is too low, AI products are too over-priced, and economic impact by AI is minimal. In fact, many of us have raised our voices for years with the exactly same warnings. 'AI is not a magic wand'. 'It is just correlation but not causality / intelligence'. 'Don't be overly enthusiastic about what a simple automation algorithm can do'.

As an institution with AI in our name, we often receive emails from a bunch of 'dreamers' that they wonder if we can make a predictive algorithm that can foretell stock price movements with 99.99% accuracy. If we could do that, why do you think we would share the algorithm with you? We should probably keep it for secret and make billions of dollars just for ourselves. As much as the famous expression by Milton Friedman, a Nobel economist, there is no such thing as a free lunch. If we have a perfect predictability and it is widely public, then the prediction is no longer a prediction. If everyone knows the stock A's price goes up, then everyone would buy the stock A, until it reaches to the predicted value. Knowing that, the price will jump to the predicted value, almost instantly. In other words, the future becomes today, and no one gets benefited.

AI = God? AI = A machine for pattern matching

A lot of enthusiasts have exorbitant optimism that AI can overwhelm human cognitivie capacity and soon become god-like feature. Well, the current forms of AI, be it Machine Learning, Deep Learning, and Generative AI, are no more than a machine for pattern matching. You touch a hot pot, you get a burn. It is painful experience, but you learn that you should not touch when it is hot. The worse the pain, the more careful you become. Hopefully it does not make your skin irrecoverable. The exact same pattern works for what they call AI. If you apply the learning processes dynamically, that's where Generative AI comes. The system is constantly adding more patterns into the database.

Though the extensive size of patterns does have great potential, it does not mean that the machine has cognitive capacity to understand the pattern's causality and/or to find new breakthrough patterns from list of patterns in the database. As long as it is nothing more than a pattern matching system, it never will.

To give you an example, can it be used what words you are expected to answer in a class that has been repeated for thousand times? Definitely. Then, can you use the same machine to predict the stock price? Aren't the stock market repeating the same behavior over a century? Well, unfortunately it is not, thus you can't be benefited by the same machine for financial investments.

Two types of data - Low noise vs. High noise

On and near the Wall Street, you can sometimes meet an excessively confident hedge fund manager with claims on near perfect foresight for financial market movements. Some of them have outstanding track records, and surprisingly persuasive. In New York Times archive back in 1940s, or even as early as 1910s, you can see people with similar claims were eventually sued by investors, arrested due to false claims, and/or just disappeared from the street within a few years. If they were that good, why then they lost money and got sued/arrested?

There are two types of data. One set of data that you can see from machine (or highly controlled environment) is called 'Low-noise' data. It has high predictability. Even in cases where embedded patterns are invisible by bare eyes, you either need more analytic brain or a machine to test all possibilities within the possible sets. For the game of Go, the brain was Se-dol Lee and the machine was Alpha-Go. The game needs to test 19x19 possible sets with around 300 possible steps. Even if your brain is not as good as Se-dol Lee, as long as your computer can find the winning patterns, you can win. This is what has been witnessed.

The other set of data comes from largely uncontrolled environment. There potentially is a pattern, but it is not the single impetus that drives every motion of the space. There are thousands, if not millions, of patterns that the driver is not observable. This is where randomness is needed for modeling, and it is unfortunately impossible to predict accurate move, because the driver is not observable. We call this set of data 'High-noise'. The stock market is the very example of such. There are millions of unknown, unexpectable, and at least unmeasurable influences that disable any analyst or machine to predict with accuracy level upto 100%. This is why financial models are not researched for predictability but used only to backtest financial derivatives for reasonable pricing.

Natural language process (NLP) is one example of low noise. Our language follows a certain set of rules (or patterns), which are called grammar. Unless you are uneducated or intentionally out of grammar (or make mistakes), people generally follow grammar. Weather is mostly low noise, but it has high noise components. Sometimes typhoons are unpredictable, or less predictable. Stock market? Be my guest. There have been 4 Nobel Prizes given to financial economists by year 2023, and all of them are based on the belief that stock markets follow random processes, be it Gaussian, Poisson, and/or any other unknown random distributions. (Just in case, if a process follows any known distribution, that means it is probabilistic, which means it is random.)

Potential benefits of AI

We as an institution hardly believe current forms of AI will make any significant changes in businesses and our life in short term. The best we can expect is automation of mundane tasks. Like laundary machine in early 20th century. ChatGPT already has shown us a path. Soon, CS operators will largely be replaced by LLM based chatbots. US companies actively outsourced the function from India for the past a few decades, thanks to cheaper international connectivity via internet. It will still remain, but human actions will be needed way less than before. In fact, we already get machine generated answers from a number of international services. If we complain about a program's malfunction on a WordPress plugin, for instance, established services email us machine answers first. For a few cases, it actually is enough. The practice will become more popular to less-established services as it becomes easier and cheaper to implement.

Teamed up with EduTimes, we also are working on a research to replace 'Copy Boys/Girls'. Journalists that we know from large news magazines are not always running on the street to find new and fascinating stories. In fact, most of them read other newspapers and rewrite the contents as if they were the original sources. Although it is not an important job, it is still needed for the newspaper to run. They need to keep up the current events, accoring to the EduTimes journalists from other renouned newspapers. The copy team is usually paid the least and seen a death sentence as a journalist. What makes the job more sympathetic on top of the least respect, it will soon be replaced by LLM based copywriters.

In fact, any job that generates patterned contents without much of cognitivie functions will gradually be replaced.

What about automotive driving? Is it a low-noise pattern job or a high-noise complicated cognitive job? Well, although Elon Musk claims high possibility of Lv. 4 auto-driving within next a few years, we don't believe so. None of us at GIAI have seen any game theorists have solved multi-agent ($n$>2) Bayesian belief game with imperfect information and unknown agent types by computer so that the automotive driving algorithm can predict what other drivers on the road will do. Without the right prediction of others on the fast moving vehicles, it is hard to tell if your AI will help you successfully avoid other crazy drivers. The driving job for those eventful cases needs 'instinct', which requires another set of bodily function different from cognitive intelligence. The best that the current algorithm can do is to tighten it up to perfection for a single car, which already needs to go over a lot of mathematical, mechanical, organisational, legal, and commercial (and many more) challenges.

Don't they know all that? Aren't the Wall Street investors self-confident, egocentric, but ultra smart that they already know all the limitations of AI? We believe so. At least we hope so. Then, why do they pay attention to the discontentful pessimism now, and create heavy drops in tech stock prices?

Guess the Wall Street hates to see Silicon Valley to be paid too much. American East often think the West too unrealistic and floating in the air. OpenAI's next round funding may surprise us in a totally opposite direction.

Picture

Member for

8 months 1 week

Real name

Ethan McGowan

Bio

Founding member of GIAI & SIAI
Professor of Data Science @ GSB

MSc AI/Data Science vs. Boot Camp for AI

Picture

Member for

8 months 1 week

Real name

David O'Neill

Bio

Founding member of GIAI & SIAI
Professor of Data Science @ GSB

Published

July 29, 2024

Modified

June 9, 2025

Boot camp is for software programming without mathematical training
MSc is a track for PhD, with in-depth scientific research written in the language of math and stat
We respect programmers, but our works are significantly varying

Due to the fact that we are running SIAI, an higher educational institution for AI/Data Science, we often have questions about the difference between Boot Camps for AI and MSc programmes. The shortest answer is the difference in Math requirements. Masters track is for people looking for academic training so that one can read academic papers in that subject. With PhD in the topic, we expect the student to be able to lead a research. From Boot Camp, sorry to be a little aggressive here, but we only expect a 'Coding Monkey'.

We are aware that many countries are shallow in AI/Data Science that they want employees only to be able to best use of Open AI's and AWS's libraries by Rest API. For that, boot camp should be enough, unless the boot camp teacher does not know how to do so. There are nearly infinite amount of contents for how to use Rest API for your software, regardless of your backend platform, be it an easy script languages like Python or tough functional ones like OCaml. Difficulties are not always indicators of determinants in challenges, and we, as data scientists at GIAI, care less about what language you use. What's important is how flexible your thinking for mathematically contained modeling.

Boot camp for software programing, MSc for scientific training

Unfortunately, unless you are lucky enough to be born as smart as Mr. Ramanujan, you cannot learn math modeling skills from a bunch of blogs. Programming, however, has infinitely many proven records of excellent programmers without school traninng. Elon Musk is just one example. He did Economics and Physics in his undergrad at U Penn, and he only stayed one day in the mechanical engineering PhD program at Stanford University. Programming is nothing more than a logic, but math needs too many building blocks to understand the language.

When we first build SIAI, we had quite a lengthy discussion for weeks. Keith was firm that we should stick to mathematical aspects of AI/Data Science. (which doesn't mean we should only teach math, just to avoid any misunderstanding.) Mc wanted two tier tracks for math and coding. We later found that with coding, it is unlikely that we can have the school accreditted by official parties, so we end up with Keith's idea. Besides, we have seen too many Boot Camps around the world that we do not believe we can be competitive in that regard.

The founding motto of the school is 'Rerum Cognoscere Causaus', meaning 'the real cause of things'. With mathematical tools, we were sure that we can teach what are the reason behind a computational model was first introduced. Indeed, Keith has done so well in his Scientific Programming that most students no longer bound to media brainwashing that Neural Network is the most superior model.

Scientists do our own stuff

If you just go through Boot Camps for coding, chances are that you can learn the limitations of Neural Network just by endless trials and errors, if not somebody's Medium posts and Reddit comments. In other words, without the proper math training, it is unlikely one can understand how the computational logics of each model are built, which makes us to aloof from all programmers without necessary math training.

The very idea comes from multiple rounds of uneasy exposures to software engineers without a shred of understanding in modeling side of AI. They usually claim that Neural Network is proven to be the best model, and they do not need any other model knowledge. And all they have to do is to run and test it. Researchers at GIAI are trained scientists, and we mostly can guess what will happen just by looking at equations. And, most importantly, we are well aware that NN is the best model only for certain tasks.

They kept claim that they were like us, and some of them wanted to build a formal assocation with SIAI (and later GIAI). It's hard for us to work with them, if they keep that attitude. These days, whenever we are approached by third parties, if they want to be at equals with us, we ask them to show us math training levels. Please make no mistake that we respect them as software engineers, but we do not respect them as scientists.

Guess aforementioned story and internal discomfort tells you the difference between software engineers and data/research scientists, let alone tools that we rely on.

We screen out students by admission exams in math/stat

With the experience, Keith initiated two admission exams for our MSc AI/Data Science programmes. At the very beginning, we thought there will be plenty of qualifying students, so we used final year undergrad materials. There was a disaster. We gave them two months of dedicated training. Provided similar exams and solved each one of them with extra detail. But, only 2 out of 30 students were able to get grades good enough to be admitted.

We lowered the level down to European 2nd year (perhaps American 3rd year), and the outcome wasn't that different. Students were barely able to grasp superficial concepts of key math/stat. This is why we were kinda forced to create an MBA program that covers European 2nd year teaching materials with ample amount of business application cases. With that, students survive, but answer keys in their final exam tell us that many of them belong to coding Boot Camps, not SIAI.

From year 2025 and onwards, we will have one admission exam for MSc AI/Data Science (2 year) in March, after 2 months pre-training in Jan and Feb. The exam materials will be 2nd year undergrad level. If a student passes, we offer an exam with one notch up in June, again after 2 months pre-training in Apr and May. This will give them MSc AI/Data Science (1 year) admission.

Students who failed the 2-year track admission, we offer them MBA AI program admission, which covers some part of the 2-year track courses. If they think they are ready, then in the following year, they can take the admission exam again. After a year of various courework, some students have shown better performance, based on our statistics, but not by much. It seemed like the brain has its limit that they cannot go above.

Precisely by the same reason, we are reasonably sure that not that many applicants will be able to come to 2-year track, and almost no one for the 1-year track. More details are available from below link:

Program comparison - Swiss Institute of Artificial Intelligence (siai.org)

Picture

Member for

8 months 1 week

Real name

David O'Neill

Bio

Founding member of GIAI & SIAI
Professor of Data Science @ GSB

Member for

Input

Changed

Member for

Similar Post

Member for

Input

Changed

Member for

Similar Post

Member for

Input

Changed

Member for

Similar Post

All News

Financial

Tech

Policy

Bio&Science

All Economy Books

Published

1. Introduction

1.1. Background

1.2. Problem Statement

1.3. Aims and Objectives

1.4. Summary of Contributions and Achievements

2. Solution Approach

2.1. Best Place to Ride a Seoul Bike

2.2. The Cause of The Public Bike Projects' Deficit

2.3. Time of Day Usage Patterns of Seoul Bike Users

2.4. Summary

3. Methodology

3.1. Correlation of 24-hour weather data with rental demand

3.2. Seoul Bike Daily Return Pattern

3.3. Modeling of Seoul bike cumulative volume during commuting hours

3.4. Bike Demand Forecasting by using SARIMAX Modeling

3.5. Logic for improving the efficiency of bicycle relocation

3.6. Seoul Bike cost efficiency solution according to 1-Day Index Range

3.7. Seoul Bike operation plan idea using spatial equilibrium

4. Results

4.1. Cumulative number of Seoul Bike rentals in Gangseo-gu according to D Index

4.2. Cumulative number of Seoul Bike rentals in Seoul according to D Index

4.3. Results of spatial equilibrium implementation through application of Louvain algorithm

References

Published

I. Introduction

II. Literature Review

III. Model

A. Setup

B. Production and Capital Allocation

C. Government Objectives and Tax Rates

D. Nash Equilibrium Analysis

IV. Simulations and Results

A. Net Capital Positions and Tax Competition Dynamics

B. Public Good Provision and Welfare Implications

C. The Role of the Overlapping Jurisdiction

V. Conclusion

References

APPENDIX 1 - DERIVING REACTION FUNCTIONS

APPENDIX 2 - DERIVING NASH EQUILIBRIUM

Published

1. Introduction

2. Methodology

2.1. Research Subjects

2.2. Data Sources

2.3. Variable Definitions

2.4. Variable Composition

2.5. Research Method

3. Result and Discussion

3.1. Impact of the COVID-19 Pandemic

3.2. Impact of Blood Shortage Periods

3.3. Effects of Promotions

4. Conclusion

References

Member for

Published

Modified

Data Scientific Intuition that matters the most

Intution does not come from hard study alone