
Why can't companies keep top-tier data scientists / research scientists?

By Keith Lee, Professor of AI/Data Science @SIAI, Senior Research Fellow @GIAI Council, Head of GIAI Asia

Top brains in AI/Data Science are drawn to challenging work like modeling
A 2nd-tier company, with its countless malpractices, can seldom meet their expectations
Even with $$$, such firms are soon forced out of the AI game

A few years ago, a large Asian conglomerate acquired a Silicon Valley start-up fresh off an early Series A round. Call it start-up $\alpha$. The M&A team leader later told me that the acquisition was mostly to hire the data scientist at the early-stage start-up, but the guy left $\alpha$ on the day the M&A deal was announced.

I had an occasion to sit down with that data scientist a few months later and asked him why. He tried to avoid the conversation, but it was clear that the changed circumstances were not what he had expected. Unlike the crowd of junior data scientists at Silicon Valley's large firms, he carried genuine graduate training in math and statistics, and we spent a pleasant half hour talking about models. At the large firms he had been treated poorly: like the other juniors, he was assigned to run SQL queries and build Tableau graphs. His PhD training was useless there, so he had decided to become a founding member of $\alpha$, where he could build models and test them on live data. The Asian acquirer, with its bureaucratic HR system, wanted him to give up that agenda and instead transplant the Silicon Valley firms' junior data scientist training system into the acquirer.

Photo by Vie Studio

Brains go for brains

Given the tons of other positions available, he didn't waste his time. Personally, I too have lost months of my life to mere SQL queries and fancy graphs. Some people may still chase the 'data scientist' title, but I am my own man. So was the data scientist from $\alpha$.

These days, Silicon Valley firms call modelers 'research scientists' or similar names. There are also positions called 'machine learning engineers', whose jobs are somewhat related to the research scientists' but usually exclude the mathematical modeling parts and involve far more software engineering. The title 'data scientist' is now given to jobs that used to be called 'SQL monkey'. As the old nickname suggests, not many trained scientists would love that job, even with a competitive salary package.

What companies have to understand is that we research scientists are trained not for SQL and Tableau but for mathematical modeling. It is like asking a hard-trained sushi chef (将太の寿司, Shota no Sushi) to make street food like Chinese noodles.

Let me give you an example from the real corporate world. Say a semiconductor company $\beta$ wants to build a test model for a wafer / substrate. What I often hear from such companies is that they build a CNN model that reads the wafer's image and matches it against pre-labeled 0/1 entries for error detection. Similar practices have been widely adopted among Neural Network maniacs. I am not saying it does not work. It works. But what would you do if the pre-labeling was done poorly? Say there were over 10,000 0/1 entries and hardly anybody double-checked their accuracy. Can you rely on that CNN-based model? On top of that, the model probably requires an enormous amount of computation to build, let alone to test and operate daily.

Wrong practices that drive out brains

Instead of that costly and less scientific option, we can always build a model that captures the data generating process (DGP). The wafer is composed of $n \times k$ entries, and issues emerge when an entire $n \times 1$ column or $1 \times k$ row goes wrong together. Given that domain knowledge, one can build a model with cross-products between entries in the same row/column. If the product runs continuously at 1 (assume 1 marks an error), the case can easily be identified as a defect.

The cost of building such a model? It just needs your brain. There is a good chance you don't even need a dedicated graphics card for the calculation. Maintenance costs are also incomparably smaller than the CNN version's. The concept of computational cost is something you were supposed to learn in any scientific programming class at school.
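
To make the idea concrete, here is a minimal sketch of the row/column cross-product check, assuming a binary $n \times k$ error map where 1 marks a defective entry. The function name and the run-length threshold are illustrative, not from any real $\beta$ codebase.

```python
import numpy as np

def detect_line_defects(wafer, run_length=3):
    """Flag rows/columns of a binary error map (1 = error) that contain
    a contiguous run of at least `run_length` errors. A product over a
    window of entries equals 1 only when every entry in the window is
    an error -- the cross-product idea described in the text."""
    defects = []
    for name, lines in (("row", wafer), ("col", wafer.T)):
        for i, line in enumerate(lines):
            for start in range(len(line) - run_length + 1):
                if np.prod(line[start:start + run_length]) == 1:
                    defects.append((name, i))
                    break
    return defects

wafer = np.zeros((5, 6), dtype=int)
wafer[2, 1:5] = 1                       # contiguous error run along row 2
print(detect_line_defects(wafer))       # → [('row', 2)]
```

The whole check is a handful of products and comparisons per line, which is exactly why no dedicated graphics card is needed.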

For companies sticking to the expensive CNN option, I can always spot the following:

  • The management has little to no sense of 'computational cost'
  • The management cannot discern 'research scientists' from 'machine learning engineers'
  • The company is full of engineers with no sense of mathematical modeling

If you want to grow as a 'research scientist', just like the guy at $\alpha$, then run. If you are smart enough, you have already run, like the guy at $\alpha$. After all, this is why many 2nd-tier firms end up with CNN maniacs like $\beta$. Most 2nd-tier firms are unlucky in that they cannot keep research scientists, for lack of knowledge and experience. Those companies have to spend years and millions of wasted dollars to find out how wrong they were. By the time they come to their senses, it is usually already far too late. If you are good enough, don't waste your time on a sinking ship. The management needs a so-called cold-turkey shock treatment as a solution. In fact, there was a start-up where I stayed only a week; it lost at least one data scientist every week and went bankrupt within 2 years.

What to do and not to do

At SIAI, I place Scientific Programming right after elementary math/stat training. Students see that each calculation method is an invention to overcome the limitations of earlier options, but that the modification simultaneously bounds the new tactic in other directions. Neural Networks are just one of many such kinds. Even after that eye-opening experience, some students remain NN maniacs, and they flunk the Machine Learning and Deep Learning classes. Those students believe there must exist a grand model that is universally superior to all others. I wish the world were that simple, but my ML and DL courses break that very belief. Those who awaken usually become excellent data/research scientists. Many of them come back to tell me they were able to cut computational costs by 90% just by replacing blindly implemented Neural Network models.

Once they see that dramatic cost reduction, at least some people understand that the earlier practice was wrong. The smart ones will not be happy to suffer poor management and NN maniacs for long. Just like the guy at $\alpha$, it is always easier to change your job than to fight to change incapable management. Managers who move fast may be able to hold on to the smart ones. If not, you are just like $\beta$: you invest a big chunk of money in an M&A just to hire a smarty, and the smarty disappears.

So, you want to keep the smarty? The solution is dead simple. Test math/stat training levels through scientific programming. You will save tons of $$$ on graphics card purchases.


ChatGPT to replace not (intelligent) jobs but (boring) tasks

By Ethan McGowan, Founding member of GIAI & SIAI, Professor of Data Science @ GSB

ChatGPT will replace not jobs but tedious tasks
For newspapers, the 'rewrite man' will soon be gone
For other jobs, the 'boring' parts will be replaced by AI,
but not the intellectual and challenging parts

There has been over a year of hype around Large Language Models (LLMs). At the onset of the hype, people outside the field asked me if their jobs were about to be replaced by robots. By now, after over a year of trials with ChatGPT, they finally seem to understand that it is nothing more than an advanced chatbot that is still unable to stop generating 'bullshit', in the words of Noam Chomsky, the American professor and public intellectual known for his work in linguistics and social criticism.

As my team at GIAI predicted in early 2023, LLMs will be able to replace some work, but most of what is replaced will be simple, mundane tasks. That's because these language models are built to find high correlations between text/image groups, but are still unable to 'intelligently' find logical connections between thoughts. In statistics, high correlation with no causality is called a 'spurious relation'.

LLMs will replace 'copy boys/girls'

When EduTimes first approached us back in early 2022, they thought we could create an AI machine to replace writers and reporters. We told them the best we could create would replace a few boring desk jobs like the 'rewrite man': the job of rewriting what other newspapers have already reported. 'Copy boy' is one well-known disparaging term for it. Most large national magazines keep such employees just to stay up to date with recent news.

Since none of us at GIAI come from journalism, and EduTimes is far from a large national magazine, we do not know the exact proportion of 'rewrite men' at large magazines, let alone how many articles they rewrite. But based on what we see in magazines, we can safely guess that 60~80% of articles are written by the 'copy boys/girls'. Some of them run a high risk of plagiarism. This is one sad reality of the journalism industry, according to the EduTimes team.

The LLM we are working on, GLM (GIAI's Language Model), isn't that different from its competitors: we also have to rely on correlations between text bodies, or more precisely 'associations' in the sense of the association rules in machine learning textbooks. Likewise, we also have plenty of inconsistency problems. To avoid Noam Chomsky's famous accusation that 'LLMs are bullshit generators', the best any data scientist can do is set high cut-offs on support, confidence, and lift. Beyond that, it is not the job of data models, which include all AI variants for pattern recognition.
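
As a toy illustration of those cut-offs, the three metrics can be computed directly from co-occurrence counts. The transactions and item names below are made up for the example; real LLM pipelines work on vastly larger token statistics.

```python
def rule_metrics(transactions, a, b):
    """Support, confidence, and lift of the rule a -> b,
    computed from raw co-occurrence counts."""
    n = len(transactions)
    n_a = sum(a in t for t in transactions)
    n_b = sum(b in t for t in transactions)
    n_ab = sum(a in t and b in t for t in transactions)
    support = n_ab / n                        # how often a and b co-occur
    confidence = n_ab / n_a if n_a else 0.0   # P(b | a)
    lift = confidence / (n_b / n) if n_b else 0.0   # vs. b's base rate
    return support, confidence, lift

# Hypothetical mini 'corpus' of item sets, purely for illustration.
txns = [{"model", "data"}, {"model", "data"}, {"model"}, {"data"}, {"noise"}]
s, c, l = rule_metrics(txns, "model", "data")
print(round(s, 2), round(c, 2), round(l, 2))   # → 0.4 0.67 1.11
```

Setting a high cut-off simply means discarding any rule whose three numbers fall below chosen thresholds; nothing in the arithmetic knows whether the association is causal.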

Photo by Shantanu Kumar

But still correlation does not necessarily mean causality

The reason we see infinitely many 'bullshit' cases is that LLM services still belong to statistics, a discipline for finding not causality but correlation.

For high correlation to be read as high causality, one important condition has to be satisfied: the data set must contain all coherent information, so that high correlation naturally means high causality. This is exactly where we need EduTimes. We need clean, high-quality, topic-specific data.

After all, this is why OpenAI is willing to pay for data from Reddit.com, a community with intense, quality discussions. LLM service providers are negotiating with top U.S. newspapers for precisely the same reason. Coherent, quality news articles do not give a 100% guarantee that correlation turns into causality, but at least we can claim that most disturbing cases will disappear without time-consuming technical optimization.

By the same logic, the jobs that can be replaced by LLMs, or any other AI built on pattern-matching algorithms, are the ones with strong, repetitive patterns that do not require logical connections.

AI can replace not (intelligent) jobs but (boring) tasks

As we often joke at GIAI, technologies are bounded by mathematical limitations. Unfortunately, we are not John von Neumann, who could solve seemingly impossible mathematical challenges as easily as college problem sets. Thanks to computational breakthroughs, we are already far beyond what we expected 10 years ago. Back then, we did not expect to extract corpora from 10 books in a few minutes; if anything, we thought it needed weeks of supercomputer resources. Not anymore. But even with this surprising speed of computational achievement, we are still bound by mathematical limits. As said, correlation without causality is 'bullshit'.

With the current mathematical limitations, we can say

  • AI can replace not (intelligent) jobs but (super mega ultra boring) tasks

And the replaceable tasks are the boring, tedious, repetitive, patterned ones. So please stop worrying about losing your job if it tortures your brain to think. Instead, think about how to use LLMs as automation to lighten the burden of mundane tasks. They will be like your mom's washing machine and dishwasher: younger generations of women are no longer bound to housekeeping. They go out to workplaces and fight for the positions that match their dreams, desires, and wants.


Post hoc, ergo propter hoc - impossible challenges in finding causality in data science

By David O'Neill, Founding member of GIAI & SIAI, Professor of Data Science @ GSB

Data science can find correlation but not causality
In statistics, high correlation with no causality is called 'spurious regression'
Hallucinations in LLMs are representative examples of spurious correlation

Imagine twin boys living in the same neighborhood. One prefers to play outside day and night, while the other mostly sticks to his video games. A year later, doctors find that the gamer boy is much healthier, and conclude that playing outside is bad for growing children's health.

What do you think of the conclusion? Do you agree with the conclusion?

Even without much scientific training, we can almost immediately dismiss the conclusion as resting on lopsided logic, possibly driven by insufficient information about the neighborhood. For example, if the neighborhood is as radioactively contaminated as Chernobyl or Fukushima, playing outside can be as close as it gets to committing suicide. What about more nutrition reaching the gamer boy thanks to easier access to food at home? The gamer boy just had to drop the game console for 5 seconds to eat something, while his twin had to walk or run for 5 minutes to come back home for food.

In fact, there are infinitely many potential variables that may have affected the two twins' condition. From the collected data set above, the best we can tell is that, for an unknown reason, the gamer boy is medically healthier than his twin.

In more scientific terms, statistics is known to establish correlation but not causality. Even in a controlled environment, it is hard to argue that the control variable was the cause of the effect. Researchers only 'guess' that the correlation means causality.

Post Hoc, Ergo Propter Hoc

This famous Latin phrase means "after this, therefore because of it": one event is taken to be the cause of the event occurring right after it. You do not need rocket science to counter the argument that two random events are interconnected just because one occurred right after the other. It is a widely common logical mistake to assign causality merely from the order of events.

In statistics, the warning is often phrased as 'correlation does not necessarily guarantee causality'. In the same context, such a regression is called a 'spurious regression', which has been widely reported in engineers' adoption of data science.
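
A quick simulation makes the point: two independent random walks, which cannot possibly cause one another, still produce sizable sample correlations far more often than independent white noise does. Everything below is a self-contained toy, not a claim about any particular data set.

```python
import numpy as np

rng = np.random.default_rng(42)
n, trials = 500, 200

def abs_corr(walk):
    """|correlation| between two independent series of length n:
    random walks if walk=True, iid white noise otherwise."""
    if walk:
        x = np.cumsum(rng.standard_normal(n))   # trending series
        y = np.cumsum(rng.standard_normal(n))
    else:
        x = rng.standard_normal(n)              # no trend at all
        y = rng.standard_normal(n)
    return abs(np.corrcoef(x, y)[0, 1])

walk_hits = sum(abs_corr(True) > 0.2 for _ in range(trials))
iid_hits = sum(abs_corr(False) > 0.2 for _ in range(trials))
# Independent random walks 'correlate' spuriously; white noise almost never does.
print(walk_hits, iid_hits)
```

The trending structure alone manufactures the correlation; no amount of re-running changes the fact that neither series carries any information about the other.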

One noticeable example is the 'hallucination' cases in ChatGPT. The LLM only finds higher correlations between two words, two sentences, or two bodies of text (or images, these days); it fails to discern any causal relation embedded in the two data sets.

Statisticians have long worked on separating causal cases from high correlation, but the best we have so far is 'Granger causality', which mainly helps us rule causality out; it offers a philosophical frame for testing whether a third variable could be a potential cause behind a hidden correlation. The reason Professor Granger's research was awarded the Nobel Prize is that it showed it is mechanically (or philosophically) impossible to verify a causal relationship by correlation alone.
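
Granger's test itself is mechanical: it only asks whether lagged values of one series improve the prediction of another. A hand-rolled sketch (simulated data, hypothetical variable names) shows the F statistic it is built on; a large F signals predictive content, never true causation.

```python
import numpy as np

def granger_f_stat(y, x, lags=2):
    """F statistic comparing a regression of y on its own lags
    (restricted) against one that also includes lags of x
    (unrestricted). Large F = lagged x has predictive content."""
    n = len(y)
    Y = y[lags:]
    # Restricted design: intercept + own lags of y.
    Xr = np.column_stack([np.ones(n - lags)] +
                         [y[lags - k:n - k] for k in range(1, lags + 1)])
    # Unrestricted design: additionally include lags of x.
    Xu = np.column_stack([Xr] +
                         [x[lags - k:n - k] for k in range(1, lags + 1)])
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    df_u = (n - lags) - Xu.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / df_u)

rng = np.random.default_rng(1)
x = rng.standard_normal(400)
y = np.roll(x, 1) + 0.1 * rng.standard_normal(400)   # y follows x by one step
print(granger_f_stat(y, x) > granger_f_stat(x, y))   # → True
```

By construction `x` leads `y`, so the F statistic is enormous in one direction and unremarkable in the other; yet the test would look exactly the same if both series were driven by a common hidden cause.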

Why AI ultimately needs human approval

The post hoc fallacy, by the nature of current AI models, is an unavoidable hurdle that all data scientists have to suffer. Unlike simple regression-based research, LLMs rely on such a large chunk of data that it is practically impossible to vet every connection between two text bodies.

This is where human approval is required, unless the data scientists decide to fine-tune the LLM to offer only the most probable (and thus more plausibly causal) matches. The more likely the matches are, the less likely there will be a spurious connection between the two sets of information, assuming the underlying data comes from accurate sources.

Teaching AI/data science, I surprisingly often come across 'fake experts' whose only understanding of AI is a bunch of terminology from newspapers, or a few lines of media articles at best, without any in-depth training in the basic academic tools, math and stat. When I raise Granger causality as my counterargument for the impossibility of getting from correlation to causality by statistical methods alone (so far philosophically impossible), many of them ask, "Then, wouldn't it be possible with AI?"

If the 'fake experts' had some elementary math and stat training from undergrad, they should be able to understand that computational science (the academic name of AI) is just a computer version of statistics. AI is actually nothing more than performing statistics more quickly and effectively using computer calculations. In other words, AI is a sub-field of statistics. Their question can be reframed as:

  • If it is impossible with statistics, wouldn’t it be possible with statistics calculated by computers?
  • If it is impossible with elementary arithmetic, wouldn't it be possible with addition and subtraction?

The inability of statistics to make causal inferences is the same as saying that it is impossible to mechanically eliminate hallucinations in ChatGPT. Those with academic training in the social sciences, the disciplines that collect potentially correlated variables and use human experience as the final step in concluding causal relationships, see that ChatGPT is built to mimic cognitive behavior at a shamefully shallow level. The fact that ChatGPT depends on 'human feedback' in its custom version of reinforcement learning is the very example of that basic cognitive behavior. The reason we still cannot call it 'AI' is that there is no automatic rule for the cheap copy to remove the post hoc fallacy, just as Clive Granger showed in his Nobel Prize-winning work.

Causal inference is not a monotonically increasing challenge, but a multi-dimensional problem

In natural science and engineering, where all conditions are limited and controlled in the lab (or by a machine), I often see human correction regarded as unscientific. Is human intervention really unscientific? Heisenberg's uncertainty principle states that when a human applies a stimulus to observe a microscopic phenomenon, the position and state just before the stimulus can be known, but the position after the stimulus can only be guessed. If no stimulus is applied at all, the current location and condition cannot be fully identified. In the end, human intervention is needed to earn at least partial information. Without it, one can never have any scientifically grounded information.

Computational science is not much different. To rule out hallucinations, researchers either have to change data sets or re-parameterize the model. The new model may be closer to perfection for that particular purpose, but the modification may surface hidden or unknown problems. The vector space spanned by the body of data is so large and so multidimensional that there is no guarantee a single modification will monotonically improve the model from every angle.

What is more concerning is that the data set is rarely clean, unless you are dealing with low-noise (or zero-noise) sources like grammatically correct texts and quality images. Once researchers step away from natural language and image recognition, data sets are exposed to infinitely many sources of unknown noise. Such high-noise data often suffer measurement error problems. Sometimes researchers cannot even collect important variables. These problems are called 'endogeneity', and social scientists have spent nearly a century extracting at least partial information from such faulty data.

Social scientists have modified statistics in their own way to cope with endogeneity. Econometrics is a representative example, using the concept of instrumental variables to eliminate problems such as measurement error in variables, omission of relevant variables, and two-way influence between explanatory and dependent variables. This line of research produced the 'Average Treatment Effect' and 'Local Average Treatment Effect' frameworks recognized by the Nobel Prize in 2021. It is not completely correct, but it is part of the challenge of being a little less wrong.
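
Instrumental variables can be sketched in a few lines. In the simulation below, an unobserved factor `u` drives both `x` and `y`, so plain OLS overstates the true effect (set to 1.0), while the classic IV ratio estimator, equivalent to 2SLS with a single instrument, recovers it. All names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000

u = rng.standard_normal(n)             # unobserved confounder
z = rng.standard_normal(n)             # instrument: moves x, not y directly
x = z + u + 0.5 * rng.standard_normal(n)
y = 1.0 * x + u + 0.5 * rng.standard_normal(n)   # true effect of x on y is 1.0

# Naive OLS slope is biased upward because u drives both x and y.
ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# IV ratio estimator cov(z, y) / cov(z, x): equals 2SLS with one instrument.
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(abs(iv - 1.0) < abs(ols - 1.0))   # → True: IV lands nearer the truth
```

The whole trick rests on the assumption that `z` affects `y` only through `x`; the data alone can never certify that exclusion restriction, which is exactly where human judgment re-enters.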

Some untrained engineers claim magic with AI

Here at GIAI, many of us share our frustration with untrained engineers who confuse 'AI' as a marketing term for limited automation with real self-evolving 'intelligence'. The silly claim that one can find causality from correlation is not that different. The very fact that they make such spoofing arguments proves they are unaware of Granger causality or any philosophically robust proposition connecting (or disconnecting) causality and correlation, which in turn proves they lack the scientific training to handle statistical tools. Given that the current version of AI is no better than pattern matching for higher frequency, there is no doubt that scientifically untrained data scientists are not entitled to the title.

Let me share one bizarre case that I heard from a colleague here at GIAI, about his country. In case anyone feels the following example is a little insulting, half of his jokes are about his country's incapable data scientists. At one of the tech companies there, a data scientist was asked to separate a handful of causal events from a pile of correlations. The guy said, "I asked ChatGPT, but there seem to be limitations because my GPT version is 3.5. I should get a better answer with 4.0."

The guy is not only unaware of the post hoc fallacy in data science; he most likely does not even understand that ChatGPT is no more than a correlation machine for texts and images given a prompt. This is not something you can learn on the job. It is something you should learn at school, which is precisely why many Asian engineers are driven to the misconception that AI is magic. Asian engineering programs are generally known to focus less on mathematical foundations, unlike renowned western universities.

In fact, it is not his country alone. The crowding-out effect gets heavier as you go to more engineer-driven conferences and less sophisticated countries and companies. Despite the shocking inability, given the market hype for generative AI, I guess those guys are paid well. Whenever I come across mockeries like the untrained engineers and buffoonish conferences, I just laugh and shake it off. But when it comes to business, I cannot help asking myself whether they are worth the money.


Why is STEM so hard? Why the high dropout rate?

By Catherine Maguire, Founding member of GIAI, Professor of Data Science @ GSB

STEM majors are known for high dropout rates
Students need more information before jumping into STEM
Admission exams and tiered education can work, if designed right

Over years of studying and teaching in STEM (Science, Technology, Engineering, and Mathematics) fields, it is not uncommon to see students disappear from the program. They often turn up in a different program, or sometimes they just leave the school. There is no commonly shared dropout rate across countries, universities, and specific STEM disciplines, but the general tendency is that more difficult course material drives more students out. Math and physics usually lose the most students, and graduate schools lose far more students than undergraduate programs.

Photo by Monstera Production / Pexel

At the onset of SIAI, though there were growing concerns that we should set the admission bar high, we came to agree that we should give students a chance. Unlike other universities with strict quotas for each program, due to classroom sizes, numbers of professors, and so on, we provide everything online, so we thought we were limitless, or at least that we could extend the limit.

After years of teaching, we have come to accept that students are rarely ready to study STEM topics. Most have been exposed to the wrong education in college, or even in high school. We had to brainwash them onto the right track of using math and statistics for scientific study. Many students are not that determined, either. They give up in the middle of the program.

With that accumulated experience, we can now argue that the high dropout rate in STEM fields can be attributed to a variety of factors; it is not solely due to a high number of unqualified students or to the difficulty of the classes. Here are some key factors that can contribute to the high dropout rate in STEM fields:

  1. High Difficulty of Classes: STEM subjects are often challenging and require strong analytical and problem-solving skills. The rigor of STEM coursework can be a significant factor in why some students may struggle or ultimately decide to drop out.
  2. Lack of Preparation: Some students may enter STEM programs without sufficient preparation in foundational subjects like math and science. This lack of preparation can make it difficult for students to keep up with the coursework and may lead to dropout.
  3. Lack of Support: Students in STEM fields may face a lack of support, such as inadequate mentoring, tutoring, or academic advising. Without the necessary support systems in place, students may feel isolated or overwhelmed, contributing to higher dropout rates.
  4. Perceived Lack of Relevance or Interest: Some students may find that the material covered in STEM classes does not align with their interests or career goals. This lack of perceived relevance can lead to disengagement and ultimately dropout.
  5. Diversity and Inclusion Issues: STEM fields have historically struggled with diversity and inclusion. Students from underrepresented groups may face additional barriers, such as lack of role models, stereotype threat, or feelings of isolation, which can contribute to higher dropout rates.
  6. Workload and Stress: The demanding workload and high levels of stress associated with STEM programs can also be factors that lead students to drop out. Balancing coursework, research, and other commitments can be overwhelming for some students.
  7. Career Prospects and Job Satisfaction: Some students may become disillusioned with the career prospects in STEM fields or may find that the actual work does not align with their expectations, leading them to reconsider their career path and potentially drop out.

It's important to note that the reasons for high dropout rates in STEM fields are multifaceted and can vary among individuals and institutions. Addressing these challenges requires a holistic approach that includes providing academic support, fostering a sense of belonging, promoting diversity and inclusion, and helping students explore their interests and career goals within STEM fields.

Photo by Max Fischer / Pexel

Not just for the gifted bright kids

Given what we have witnessed so far, at SIAI we have changed our admission policy quite dramatically. The most important change is that we now have admission exams, and courses that prepare students for them.

Although it sounds a little paradoxical that students come to the program to study for an exam, not vice versa, we have come to understand that our customized exam greatly helps us find each student's true potential. The only problem with the admission exam is that it mostly knocks students out at the front door. We therefore offer classes to help students prepare.

This is a real beauty of online education. We are not bound by location or time. Students can go over the prep materials on their own schedule.

So far, we are content with this option for the following reasons:

  1. Self-motivation: The exams are designed so that only dedicated students can pass. They have to do, re-do, and re-do the earlier exams multiple times; without self-motivation, they skip the study and fail. Online education unfortunately cannot give detailed day-by-day mental care. Students have to be mature in this regard.
  2. Measure of preparation level: Hardly any student, from any major, even a top school's STEM program, arrives prepared enough to follow the mathematical intuitions thrown around in class. We designed the admission exam one level below the desired course of study, so a failure means the student is not even ready for the lower-level material.
  3. Introduction to the challenge: Students are indeed aware of the challenges ahead of them, but their sense of the depth is often shallow. An exam set 1~2 courses below the real challenge has so far consistently helped convince students of the load of work required if they want to survive.

Seldom are students well prepared. The gifted ones will likely be rewarded with scholarships and other activities in and around the school. But most students are not gifted, and that is why schools exist. Given the high dropout rate in STEM, it is simply the school's job to give out the right information and pick the right students.


Following AI hype vs. Studying AI/Data Science

By Keith Lee, Professor of AI/Data Science @SIAI, Senior Research Fellow @GIAI Council, Head of GIAI Asia

People following the AI hype are mostly completely misinformed
AI/Data Science is still limited to statistical methods
Hype can only attract ignorance

As a professor of AI/Data Science, I receive emails from time to time from hyped followers claiming that what they call 'recent AI' can solve things I have been pessimistic about. They usually think 'recent AI' is close to 'Artificial General Intelligence', meaning a program that learns by itself and surpasses human intelligence.

In the early days of my start-up, I answered them with quality content. I soon realized that they just want to hear what they want to hear, and criticize anyone saying what they don't want to hear. Last week, I came across a Scientific American article about the history of automatons that were actually people ("Is There a Human Hiding behind That Robot or AI? A Brief History of Automatons That Were Actually People", Scientific American).

Source=X (Twitter)

AI hype followers' ungrounded dream for AGI

No doubt many current AI tools are far more advanced than the medieval 'machines' discussed in the Scientific American article, but human-made AI tools are still limited to finding patterns and abstracting them by extracting common features. The process requires implementing a logic, whether discovered by a human or by human-written code, and the machine code we rely on is still limited to statistical approaches.

AI hype followers claim that recent AI tools have already overcome the need for human intervention. The truth is, even Amazon's AI checkout, which supposedly needed no human cashier, was found to depend on a large number of human inspectors, according to the aforementioned Scientific American article.

As far as I know, 9 out of 10, in fact 99 out of 100, research papers in second-tier (or below) AI academic journals merely re-generate a few leading papers on different data sets with only minor changes.

The leading papers in AI, as in all other fields, adapt computational methodologies to fit new data sets and different purposes, but the technique is original and helps with many unsolved issues. Going down to the second tier or below, it is mere regeneration, so top-class researchers usually don't waste time on such papers. The problem is that even the top journals cannot be reserved for groundbreaking papers alone: by definition, there are not that many groundbreaking papers. We mostly go up one step at a time, which is already ultra painful.

Going back to my graduate studies, I tried to build a model in which the high speed of information flow among financial investors leads them to follow each other and copy the winning model, resulting in financial markets overshooting (both hype and crash) at an accelerated speed. The process of information sharing that results in a suboptimal market equilibrium is called the 'Hirshleifer effect'. Modeling that idea into an equation that fits a variety of cases is a demanding task. Every researcher has their own take, because they solve different problems and come from different backgrounds. It is unlikely we will end up with one common form for the effect. This is how science works.

Hype that attracts ignorance

People outside of research, people in marketing raising the AI hype, and people unable to understand research but able to understand marketers' catchphrases are the ones who frustrate us. As mentioned earlier, I did try to persuade them that it is only hype and that reality is far from the catchlines. I gave that up years ago.

Friends of mine who never pursued grad school sometimes claim that they just need to test the AI model. For example, if an AI engineer claims that his or her AI can beat Wall Street's top-class fund managers by a double or triple margin, my friends think that all they need to do, as venture capitalists, is test it for a certain period of time.

The AI engineer may not be smart enough at first to hide the failed results, but a series of failed funding attempts will make him smarter. From a certain point on, I am sure the AI engineer shows off only the successful test cases from a limited time span. My VC friends will likely be fooled, because there is no algorithm that can beat the market consistently. If I had such a model, I would not go for VC funding. I would set up a hedge fund, or just trade with my own money. If I could win with 100% probability and zero risk, why share the profit with somebody else?

The hype disappears not by a few failed tests, but by no budget in marketing

Since many ignorant VCs are fooled, the hype continues. Once the funding is secured, the AI engineer spends even more on marketing, so that potential investors are brainwashed by the artificial success story.

As the tests fail repeatedly, the actual investments made with fund buyers' money also fail. Clients begin complaining, but the hype is still high and the VC's funding is not yet dry. On top of that, the VC is now desperate to raise the invested AI start-up's value, so he or she also lies. The VC may be uninformed of the failed tests and is unlikely to hear the complaints from angry clients directly. The VC's lies, however unintentional, support the hype. The hype goes on. Until when?

The hype becomes invisible when people stop talking about it. And when do people stop talking about it? When the product is no longer new? Maybe. But for AI products, once there are no real use cases, people finally understand that it was all marketing hype. Fewer clients mean less word of mouth. To pump up the dying hype, the company may pour more budget into marketing, and it does so until it completely runs out of cash. At some point there are no more ads, so people just move on to something else. Finally, the hype is gone.

Then, AI hype followers no longer send me emails with disgusting and silly criticism.

Following AI hype vs. Studying AI/Data Science

On the contrary, some people are determined to study this subject in depth. They soon realize that copying a few lines of program code from GitHub.com does not make them experts. They may read a few 'tech blogs' and textbooks, but the smarter they are, the faster they catch on that it requires loads of mathematics, statistics, and far more scientific background than they studied in college.

They begin looking for education programs. Over the last 7~8 years, a growing number of universities have created AI/Data Science programs. At the very beginning, many programs focused too much on computer programming, but under competition from coding boot-camps and pressure from accreditation institutions, most AI/Data Science programs at top US research schools (or schools of similar level around the world) now offer mathematically heavy courses.

Unfortunately, many students fail, because the math and stat required of professional data scientists is not just copying a few lines of program code from GitHub.com. My institution, for example, runs bachelor-level courses for the AI MBA and an MSc in AI/Data Science for more qualified students. Most students know the MSc is superior to the AI MBA, but only a few can survive it. Many cannot even follow the AI MBA courses, which are on par with undergrad material. Considering the failing rates in STEM majors at top US schools, I don't think it is a surprise.

Those failing students are still better off than AI hype followers, and highly unlikely to be fooled like my ignorant VC friends, but they are unfortunately not good enough to earn a demanding STEM degree. I am sorry to see them walk away from the school without a degree, but the school is not a diploma mill.

The distance from AI hype to professional data scientists

Graduates with a shining transcript and a quality dissertation find decent data scientist positions. That gives me a big smile. But then, on the job, sadly most of their clients turn out to be mere AI hype followers. Whenever I attend an alum gathering, I hear tons of complaints from students about their work environments.

It sounds like a Janus-faced case to me. On one side, company officials hire data scientists because they follow the AI hype. They just don't know how to make AI products, and they want to make the same or better AI products than their competitors. These AI hype followers with money create the data scientist job market. On the other side, unfortunately, the employers are even worse than the failing students. They hear all kinds of AI hype, and they believe all of it. The orders given by such employers will likely be far from realistic.

Had the employers had the same level of knowledge in data science as I do, would they have hired a team of data scientists for products that cannot be engineered? Had they known that no AI algorithm can consistently beat financial markets, would they have invested in the AI engineer's financial start-up?

I admit that there are thousands of unsung heroes in this field who receive little consideration from the market simply because they never jumped into hype marketing. The capacity of those teams must be the same as, or even better than, that of world-class top-notch researchers. But even with them, there are things AI/Data Science can do and things it cannot.

Hype can only attract ignorance.


Don't be (extra) afraid of math. It is just a language


Math in AI/Data Science is not really math, but a shortened version of an English paragraph.
In science, researchers often ask presenters to 'please speak in plain English', a sign that math is just a more scientific way to explain science.

I liked math until high school, but it became an abomination during my college days. I had no choice but to put math courses on my transcript, as they were one of the key factors for PhD admission, but even after years of graduate study and research, I still don't think I like math. I liked it when it was about solving a riddle.

The questions in high school textbooks and exams are mostly about finding out who did what. But the very first math course in college forces you to prove a theorem, like 0+0=0. Wait, 0+0=0? Isn't it obvious? Why do you need a proof for this? I didn't eat any apple, and neither did my sister, so nobody ate any apple. Why do you need lines of mathematical proof for such a simple concept?

Then, while teaching AI/Data Science, I often claim that the math equations in the textbook are just a short version of long but plain English. I tell students, "Don't be afraid of math. It is just a language." They are usually puzzled, and given the pile of 0+0=0-style proofs in basic first-year math textbooks, I can grasp why my students are not convinced by the statement, at least initially. So let me lay out my detailed backup.

Source=Pexel

Math is just a language, but only in a certain context

Before I begin arguing that math is a language, let me state clearly that math is not a language in the academic definition of the word. The structure of a math theorem and corollary, for example, is not a replacement for a paragraph with a leading statement and supporting examples. There is some similarity, given that both are used to build logical thinking, but again, I am not comparing math and language in a 1-to-1 sense.

I still claim that math is a language, but only in a certain context. My field of study, along with many closely related disciplines, usually writes notes and papers in math jargon. Mathematicians may be baffled by my claim that data science relies on math jargon, but almost all STEM majors have stacks of textbooks mostly covered in math equations. The difference between math and non-math STEM majors is that the equations in non-math textbooks carry different meanings. In data science, if you find y=f(a,b,c), it means a, b, and c are explanatory variables for y through a non-linear regression form f. In math, I guess you just read it as "y is a function of a, b, and c."

My data science lecture notes are usually 10-15 pages for a 3-hour class. That might look too short to many of you, but in fact I need more time than that to cover the 15-page notes. Why? On each page, I condense many key concepts into a few math equations. Just like the statement above, "a, b, and c are the explanatory variables for y through a non-linear regression form f", I read the equations out in 'plain English'. On top of that, I give lots of real-life examples of each equation so that students can fully understand what it really means. Small variations of the equations also take hours to explain.

Let me bring up one example. Adam, Bailey, and Charlie have worked together on a group assignment, but it is unclear whether they split the job equally. Say you know exactly how the work was divided. How can you shorten that long paragraph?

y=f(a,b,c) has everything that is needed. Depending on how they divided the work, the function f is determined. If y is not a 0~100-scale grade but a 0/1 grade, then the function f has to reflect that transformation. In machine learning (or any similar computational statistics), we then need logistic/probit regressions.
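To make the 0~100 versus 0/1 distinction concrete, here is a minimal sketch of my own (the work-split weights and scaling constants are illustrative assumptions, not from the text): the same y=f(a,b,c) is fit by least squares when y is a continuous grade, and by a logistic form when y is pass/fail.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
# Columns: shares of work contributed by Adam (a), Bailey (b), Charlie (c)
X = rng.uniform(0.0, 1.0, size=(n, 3))
true_w = np.array([0.5, 0.1, 0.4])   # assumed split: Charlie absorbed Bailey's share

# Case 1: y is a continuous 0~100 grade, so f can be linear (least squares)
y_cont = 100.0 * (X @ true_w) + rng.normal(0.0, 1.0, n)
w_hat, *_ = np.linalg.lstsq(X, y_cont, rcond=None)

# Case 2: y is a 0/1 pass-fail grade, so f must map into [0, 1] (logistic)
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p_pass = sigmoid(6.0 * (X @ true_w) - 3.0)   # assumed true pass probability
y_bin = rng.binomial(1, p_pass)

Xb = np.hstack([X, np.ones((n, 1))])         # add an intercept column
beta = np.zeros(4)
for _ in range(5_000):                        # plain gradient ascent on log-likelihood
    beta += 2.0 * Xb.T @ (y_bin - sigmoid(Xb @ beta)) / n

print(np.round(w_hat / 100.0, 2))  # linear fit recovers the work split
print(np.round(beta, 1))           # logistic fit: a and c matter far more than b
```

Either way, the one-line y=f(a,b,c) is what gets written on the board; which f you pick is dictated by what kind of grade y is.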

In their assignment, I usually skip the math equation and give a long story about Adam, Bailey, and Charlie. For example, Charlie said he would put together Adam's and Bailey's research at night, because he had a date with his girlfriend in the afternoon. At 11pm, while combining Adam's and Bailey's work, he found that Bailey had done almost nothing. He had to do it himself until 3am, and re-structured everything until 6am. We all know that Charlie did a lot more work than Bailey. Then let's build it in a formal fashion, as we scientists do. How much weight would you give to b and c, compared to a? How would you change the functional form if Dana, Charlie's girlfriend, helped with his assignment that night? What if she takes the same class from another teacher and has already done the same assignment with her classmates?

If one knows all the possibilities, y=f(a,b,c) is a simple and short replacement for the four paragraphs above, and for even more variations to come. This is why I call math just a language. I am a lazy guy looking for the most efficient way to deliver my message, so I strictly prefer typing y=f(a,b,c) over four paragraphs.

Math is a universal language, again only in a certain context

Teaching data science is fun, because it is like my high school math. Instead of constructing boring proofs of seemingly obvious theorems, I try to see hidden structures in a data set and re-design the model according to the given problem. The divergence from real math comes from the fact that I use math as a tool, not as an end. To mathematicians, my way of using math might be an insult, but I often tell my students that we do not major in math but in data science.

Let's think about the medieval European lands where French, German, and Italian were first formed through the processes of pidgin and creole. In case you are not familiar with the two words: when parents do not share a common tongue, their children often learn only part of each language, and the family creates some sort of new language for internal communication. This is the pidgin process. If that language is then shared by a town or a group of towns and becomes another language with its own grammar, that is the creole process.

For data scientists, mathematics is not Latin, but French, German, or Italian at best. The form is math (like the Latin alphabet), but the way we use it is quite different from mathematicians. The major European languages are almost identical in some parts. Likewise, in data science, computer science, natural science, and even economics, some math forms mean exactly the same thing. But the way scientists use the equations in their own context often differs from field to field, just as French is a significant divergence from German (or vice versa).

A well-educated intellectual in medieval Europe was expected to understand Latin, which must have helped him or her travel across western Europe without much trouble in communication; at least basic communication would have been possible. STEM students with heavy graduate-course training should likewise be able to understand math jargon, which helps them understand other majors' research, at least partially.

Latin was the universal language of medieval Europe; so is math for many scientific disciplines.

Math in AI/Data Science is just another language spoken only by data scientists

Having said all that, I hope you can now understand that my math is different from a mathematician's math. Their math is like the Latin spoken in ancient Rome. My math is simply the Latin alphabet used to write French, German, Italian, and/or English. I just borrowed the alphabet system for my own study.

When we have trouble understanding presentations with heavy math, we often ask the presenter, "Hey, can you please lay it out in plain English?"

The concepts in AI/Data Science can be, and should be able to be, written in plain English. But then four paragraphs may not be enough to replace y=f(a,b,c). If you need far more than four paragraphs, what is the more efficient way to deliver your message? This is where you create your own language, as in the creole process. The same process occurs in many other STEM majors. For one, even economics had decades of battle between sociology-based and math-based research methods. In the 1980s, the sociology line lost the battle, because it was not sharp enough to build scientific logic. In other words, math jargon was a superior means of communication to four paragraphs of plain English in the scientific study of economics. Now one can find sociology-style economics only in a few British universities. At other schools, those researchers find teaching positions in history or sociology departments, and mainstream economists do not regard them as economists.

The field of AI/Data Science is evolving in a similar fashion. At one point, people thought software engineers were data scientists, since both jobs require computer programming. I guess nobody would argue that these days. Software engineers are engineers with programming skills for websites, databases, and hardware monitoring systems. Data scientists do write computer programs, but not for websites or databases. Their work is about finding hidden patterns in data, building a mathematically robust model with explanatory variables, and predicting user behavior through model-based pattern analysis.

What is still funny is that when I speak to other data scientists, I expect them to understand y=f(a,b,c), as in "Hey, y is a function of a, b, and c." I don't want to lay it out in four paragraphs. It's not just me; many data scientists are just as lazy as I am, and we want our counterparts to understand the shorter version. It may sound snobbish that we build a wall against non-math speakers (despite the fact that we are not math majors either), but I think this is clear evidence that data scientists use math as a form of (creole) language. We just want the same language spoken among us, like Japanese-speaking tourists looking for a Japanese-speaking guide. English-speaking guides have little to no value to them.

Math in AI/Data Science can be, should be, and must be translated to 'plain English'

A few years ago, I created an MBA program in AI/Data Science that shares the same math-based courses with the senior year of the BSc in AI/Data Science, but does not require hard math/stat knowledge. I only ask students to borrow the concepts from the math-heavy lecture notes and apply them to real-life examples. It is because I wholeheartedly believe that the simple equation can still be translated into four paragraphs. Given that we still have to speak to each other in our own tongues, it should be and must be translated into plain language if it is to be used in real life.

As an example, in the course I teach cases of endogeneity, including measurement error, omitted variable bias, and simultaneity. I make BSc students derive the mathematical forms of the bias, but I only ask MBA students to follow the logic of which bias is expected in each endogenous case, and what the closely related business examples are.

One MBA student explained his company's manufacturing line, where random error slows down an automated process, as a case of measurement error. The error results in attenuation bias, which under-estimates the mismeasured variable's impact in scale. Had the product-line manager known the link between measurement error and attenuation bias, the loss of automation due to that error would have attracted a lot more attention.
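The attenuation-bias story can be verified in a few lines. The sketch below is my own illustration with made-up numbers (not the student's data): regressing y on a noisily measured x shrinks the estimated slope by the classic factor var(x) / (var(x) + var(u)).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
beta = 2.0                      # true impact of the correctly measured variable

x = rng.normal(0, 1, n)         # true production-line input
u = rng.normal(0, 1, n)         # measurement error, same variance as x here
x_obs = x + u                   # what the sensors actually record
y = beta * x + rng.normal(0, 1, n)

# OLS slope of y on the mismeasured x_obs
slope = np.cov(x_obs, y)[0, 1] / np.var(x_obs)
print(slope)   # close to 1.0, not 2.0: attenuated by var(x)/(var(x)+var(u)) = 0.5
```

With equal noise and signal variances the estimated impact is cut in half, which is exactly why a mismeasured variable looks less important than it really is.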

As the above example shows, some MBA students in fact perform far better than students in the MSc in AI/Data Science, the more heavily mathematical track. The MSc students think the math track is superior, although many of them cannot match the math forms to actual AI/Data Science concepts. They fail not because they lack pre-training in math, but because they just cannot read f(a,b,c) as a work-allocation model for Adam, Bailey, and Charlie. They are simply too distracted by the math forms.

During admission, there is a bunch of stubborn students with the die-hard claim of 'MSc or death', and absolutely no MBA. They see the MBA as a sort of blasphemy. But within a few weeks of study, they begin to understand that hard math is not needed unless they want to write cutting-edge scientific dissertations. Most students are looking for industry jobs, and the MBA, with its lots of data-scientific intuition, is more than enough.

The teaching medium, again, is 'plain English'.

With the help of AI translation algorithms, I can now say that the teaching medium is 'plain language'.


Korean 'Han river' miracle is now over


Korean GDP growth averaged 6.4%/year for 50 years until 2022, but is down to 2.1%/year in the 2020s.
With the birthrate down to 0.7, the population is expected to halve in 30 years.
Policy fails due to a nationwide preference for left-wing agendas.
The few globally competent companies are leaving the country.

The Financial Times, on Apr 22, 2024, reported that the South Korean economic miracle is about to be over. The claim is simple: 50 years of economic growth at 6.4%/year from 1970 to 2022 has been replaced by mere 2.1%/year growth in the 2020s, and it will fall to 0.6% in the 2030s and 0.1% in the 2040s.

For a Korean descendant, it is no secret that the country's economy has never been as active since 1997, when the Asian financial crisis hit all East Asian countries hard. Based on the IMF's economic analysis, South Korea's GDP had grown over 7% per year until 1997, and by linear projection, South Korea was expected to catch up with western Europe's major economies in per capita GNI by the early 2010s. Even after the painful recovery that lasted until 2002, the yearly growth rate stayed above 5% for another decade, far above pessimistic economists' expectations, which placed Korea somewhere near 1990s Japan, a country that nearly stopped growing after its property bubble burst at the end of the 1980s.

IMF DataMapper Korea GDP Growth

The 1970s model worked until the 1990s

South Korean economic growth is mostly based on the 1970s model, in which the national government subsidized highly export-driven heavy industries. The country provided up to 20% of national GDP as debt insurance to large Korean manufacturers for their debt financing from rich countries. The model worked well until the late 1980s, when prosperity arrived through extremely favorable global conditions coined as the '3 Lows' (low Korean won, low petroleum price, low interest rate). The success led the economy to rely on short-term borrowing from the US, Japan, and other major economies until 1997.

Forced into fire sales of multiple businesses at bargain prices, the country lost its passion for growth. Large business owners became extremely conservative about new investments. They also turned to the domestic market, where competitors were small and weak. SMEs were wiped out by large conglomerates, most of which were incapable of competing internationally and thus turned to the safe domestic battle.

Korea had time and money, but collective policy failures killed it all

Compared to North Korea's economic struggle, South Korea has been the symbol of capitalist success. Over the 50 years, South Korean per capita GNI has grown from US$1,000 to US$33,000, while its northern brothers still struggle between US$1,000 and US$2,000, depending on agricultural production affected by weather conditions. In other words, while North Korea is still a pre-industrial economy, the South has grown into a major industrial power with many cutting-edge technological products, including semiconductors by Samsung Electronics and SK Hynix.

The country was able to keep higher-than-expected growth until 2020, largely because of China's massive imports. Since opening its economy by joining the WTO (World Trade Organization) in 2001, China has been the key buyer of South Korean electric appliances, smartphones, semiconductors, and many other tech products, most of which were crucial for its own economic development.

But experts have long warned that China's technological catch-up was an increasingly imminent threat to Korea's tech superiority. The gap is now mostly gone. Even the US is now raising the bar against China for 'security purposes'. The fact that the US has been keen on China's national challenge in the semiconductor industry, and now even in chemicals and biotech, is outstanding proof that China is no longer a tech follower of the western economic leaders, not to mention South Korea.

US-China trade war expedited Korea's fall

With a simple Cobb-Douglas model of capital and labor, it is easy to guess that capital withdrawal from China results in a massive capital surplus in the US market, where the economy is suffering from higher-than-usual inflation. That is the cost the US market pays. On the other hand, without its capital base, the Chinese economy is going to suffer a capital shock like the Asian Financial Crisis of 1997. The facilities are there, but the money is gone. Until some capital influx fills the gap, be it from the IMF and World Bank as in 1997 or long-term internal capital building as in Great Britain from the 1970s to the 2000s, we won't see China's economic rise.
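As a back-of-the-envelope illustration of that Cobb-Douglas logic (the numbers here are my own assumptions, using a conventional capital share of about one third), even a 20% capital withdrawal cuts output by roughly 7% with labor and technology unchanged:

```python
# Cobb-Douglas production: Y = A * K**alpha * L**(1 - alpha)
A, alpha = 1.0, 1 / 3            # assumed technology level and capital share
K, L = 100.0, 100.0              # illustrative capital and labor stocks

def output(K, L):
    return A * K**alpha * L**(1 - alpha)

Y0 = output(K, L)
Y1 = output(0.8 * K, L)          # 20% of the capital base withdrawn
loss = 1 - Y1 / Y0
print(loss)                      # about 0.072, i.e. roughly a 7% output loss
```

The diminishing-returns exponent is what keeps the loss smaller than the 20% capital shock, but a sustained withdrawal compounds year after year.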

The sluggish Chinese economy has hit its neighbors hard. Among the trading partners hit hardest are South Korea in Asia, Germany in Europe, and Apple among the Big Techs. Germany used to be the symbol of economic growth in Europe, at least during the European sovereign debt crisis of 2008-2012. Unlike other big tech companies, Apple kept its dependency on China until very recently; the company lost 40% of its stock value from its 2022 peak. The South Korean story is not that different. The major trading partner zipped up its wallet. Depending on the industry, 15% to 40% of the trade surplus disappeared, and Korean companies were not ready to replace the loss from other sources.

The clearest example that South Korea was not ready for China's withdrawal is its dependency on aqueous urea solution (AUS) for diesel-powered trucks. Over 90%, sometimes up to 100%, of the nation's AUS consumption came from Chinese sources, which were cut off twice recently. In Dec 2021 and Sep 2023, the lack of AUS left Korea's large cargo freight trucks inoperable, and the country's logistics system nearly shut down. The government tried to find replacement sources for two years, from 2021 to 2023, but still failed to avoid the second AUS crisis in Sep 2023.

For South Korea, China was a mixed blessing. Dependency on China from 1998 to 2020 helped the economy keep a growth rate high enough to run the country. But that heavy dependency now has detrimental effects on every corner of the country's industrial base. Simply put, Korea has been too dependent on China.

Education, policy, companies, all failed jointly and simultaneously

Fellow professors at major Korean universities do not expect Korea to rebound anytime soon. The economic growth model that worked in the 1970s had stopped working as early as the late 1990s, but government officials remained ignorant of the failing system. Back then, under the military regime, only successful businessmen were given government subsidies. The selection process was tough, and failing businesses were forced to close down before they could harm the wider economy.

But the introduction of a democratic system, which brought freedom to business, the press, and civil rights groups, deprived the government of total control over resource allocation. The country is no longer operated by a single powerful and efficient planner. While recovering from the devastating financial crisis of 1997, every agent in the economy learned that the government is no longer a powerful fatherhood, and tasted some economic freedom.

Had that freedom been properly channeled, the economy would have been armed with national support in subsidies and human resources, as well as a 50-million-strong domestic customer base. Instead, except for a few internationally qualified products, most companies turned inward. For lack of English-speaking manpower, companies could not compete internationally unless they had hard and unique products. Building a brand from 'copying machine' to 'tech leader' costs years of endeavor; we can see such successes only in RAM chips and K-pop singers.

Korea had time to renew its economic policy. But the Chinese honeypot provided so much illusion that Korean companies thought their superiority would last forever. Koreans kept the 'copying machine' policy. Government officials were not as keen as in the 1970s, so any sugarcoated overseas success put Korean companies in a position to demand subsidies. The country stopped growing technologically. Industry, academia, and the press became entangled in one goal: massive exaggeration to earn government subsidies. For one, the Korean government wasted US$10 billion just on basic programming courses in K-12 that are no longer needed in the era of Generative AI. In the meantime, China was not willing to stay culturally, technologically, and intellectually behind the tiny neighbor it had looked down on for the last two millennia.

While Korean education puts less and less emphasis on math, stat, and science, China has taken the opposite steps. Korea's most sought-after college major is now the medical track, while China puts mathematics at the top. And despite the fierce competition for the medical track and its large expected income, some students no longer pursue it simply because they are afraid of high school mathematics and science.

An aging society with the lowest birthrate in the world

Will there be any hope in Korea? Many of us see otherwise. The country is dying, literally. The median age is 49.5 as of 2024. The generations born in the 1980s saw nearly 1 million babies a year, while in the 2020s there are only 200,000. The population, particularly of working age, will shrink to one fifth a few decades from now. Due to tough economic conditions, young couples push marriage into their late 30s and 40s. Babies born to mothers over 35 show higher rates of genetic defects, so even that one fifth of the working population won't be as effective as today's.

Together with the misguided educational policy, the country will have fewer capable brains as time goes by. International competition will become more severe due to China's desperate catch-up in technology. Companies have already lost their passion for growth.

Economic reforms have been attempted, but the unpopular minority seldom wins elections. Even when it wins, the opposition is too strong to overcome. Officials expect that the country sits on a ticking bomb without any immediate means of defusing it.

Though I admit that other major economies suffer from similar growth fatigue, it is at least evident that South Korea is now on the 'no-hope' list. If you are looking for growth stocks, look in other countries.


Why was 'Tensorflow' a revolution, and why are we so desperate to faster AI chips?


The transition from column to matrix, and from matrix to tensor, as the baseline for feeding data changed the scope of data science,
but faster only means 'better' when we apply the tool to the right place, with the right approach.

Back in the early 2000s, when I first learned Matlab to solve basic regression problems, I was told Matlab was the better programming tool because it handles data as matrices. Instead of feeding data to the computer column by column like other software packages, Matlab loads a larger chunk at once, which roughly cuts processing from O(n×k) to O(n). More precisely, given how the software feeds the RAM, it was essentially O(k) versus O(1) per observation.
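The speed gap is easy to reproduce today in NumPy (a sketch standing in for Matlab; array sizes are arbitrary): a column-by-column loop makes k passes over the data, while a single matrix multiplication processes the whole chunk in one optimized call.

```python
import time
import numpy as np

# Illustrative n x k regression-style data; the sizes are made up.
rng = np.random.default_rng(0)
n, k = 200_000, 50
X = rng.normal(size=(n, k))
w = rng.normal(size=k)

# Column-by-column feeding: k separate passes over n rows.
t0 = time.perf_counter()
y_loop = np.zeros(n)
for j in range(k):
    y_loop += X[:, j] * w[j]
t_loop = time.perf_counter() - t0

# Matrix feeding: one chunk, one optimized call.
t0 = time.perf_counter()
y_mat = X @ w
t_mat = time.perf_counter() - t0

print(f"column loop: {t_loop:.4f}s, matrix: {t_mat:.4f}s")
```

Both paths compute the same result; only the feeding pattern differs, which is exactly the point the Matlab salespeople were making.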

Together with a couple of other features, such as quick conversion of Matlab code to C, Matlab earned huge popularity. A single copy was well over US$10,000, but companies with deep R&D budgets and universities with significant STEM research facilities all jumped on Matlab. Just when it seemed there were no competitors left, a free alternative called R rose, with packages handling data just like Matlab. R also created its own data handlers, which worked faster than Matlab for loop-style calculation. What I often call R-style (like Gangnam Style) replaced loop calculations, turning column-by-column feeding into a single matrix-type process.

R (whose corporate steward, RStudio, is now called Posit) became my main research tool until I discovered its trouble handling complex numbers. I could not reconcile R's output with my hand-derived solution and Matlab's. I later ended up with Mathematica, but given its price tag, I still relied on R to communicate with research colleagues. Even after Python's data packages prevailed, up to Tensorflow and PyTorch, I did not really bother to code in Python. Tensorflow was (and is) also available in R, and Python offered no real speed improvement. When I wanted faster calculation for the multi-dimensional tasks that call for Tensorflow, I coded the work in Matlab and transformed it to C. There was a little bug initially, but Matlab's price tag was worth the money.

A few years back, I found Julia, which has a grammar similar to R and Python but C-like calculation speed, with support for numerous Python packages. Though I am not an expert, I feel more conversant with Julia than I do with Python.

When I tell this story, I get questions about why I traveled through so many software tools. Had my math models evolved so far that I needed them? In fact, my math models are usually simple, at least to me. Then why go from Matlab to R, Mathematica, Python, and Julia?

Since my only programming experience before Matlab was Q-Basic, I did not at first appreciate the speed gain from matrix-based calculation. But when I switched to R, for loops alone almost made me cry. It felt like Santa's Christmas package held the game console I had dreamed of for years. I was able to solve numerous problems I had not been able to before, and the way I code solutions changed as well.

The same transition hit me when I first came across Tensorflow. I am not a computer scientist and do not touch image, text, or other low-noise data, so the computer scientists' introduction of Tensorflow failed to earn my attention at first. On my way back, though, I came to think of the transition from Matlab to R and the similar challenges I had struggled with. There were a number of 3D data sets that I had had to re-array into matrices, and infinitely many data sets shaped as panel data and multi-sourced time series.
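A small sketch of the difference (the panel's dimensions and variable counts here are invented for illustration): the matrix-era habit flattens a firms × quarters × variables panel into 2D and forces you to track the index bookkeeping yourself, while the tensor-era habit addresses the axis you mean directly.

```python
import numpy as np

# Hypothetical panel data: firms x quarters x variables, sizes made up.
rng = np.random.default_rng(1)
n_firms, n_quarters, n_vars = 200, 40, 6
panel = rng.normal(size=(n_firms, n_quarters, n_vars))

# Matrix-era workaround: flatten to 2D and remember which row is which.
flat = panel.reshape(n_firms * n_quarters, n_vars)

# Tensor-era style: reduce over the axis you mean, no re-arraying.
per_quarter = panel.mean(axis=0)   # (40, 6): cross-sectional averages
per_firm = panel.mean(axis=1)      # (200, 6): time averages per firm
```

The two views hold the same numbers; keeping the third dimension simply spares you the error-prone index arithmetic.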

When searching for the right stats library to solve my math problems in simple functions, R was usually not my first choice; Mathematica was, and it still is. But since the introduction of Tensorflow, I always think about how to leverage a 3D data structure to minimize my coding work.

Once successful, it not only saves coding time; it tremendously shortens my 'waiting' time. During my PhD, the night before a scheduled meeting with my advisor, I found a small but critically important error in my calculation. I was able to re-derive the closed-form solutions, but I was absolutely sure my laptop would not finish a full simulation by morning. I cheated: I created a fake graph. My advisor was sharp enough to pinpoint that something was wrong with my simulation within a few seconds. I confessed. I had been in too much of a hurry; I should have just skipped that week's meeting. It took me years to regain his confidence. With the faster machines available these days, I would not need to fake a simulation. I just need my brain to process faster, more accurately, and more honestly.

After the introduction of the H100, many LLM researchers feel less burdened by massive data. As AI chips get faster, the amount of data we can handle in a given time grows enormously. It will certainly eliminate cases like my untruthful communication with my advisor, but I always ask myself, "Where do I need hundreds of H100s?"

Though I appreciate the benefits of faster processing, and I admit that cheaper computation opens opportunities that could not be explored before, one still needs to answer 'where' and 'why' it is needed.


Why market interest rates fall every day while the U.S. Federal Reserve waits and why Bitcoin prices continue to rise


When an expectation about the future is shared, the market reflects it immediately
The US Fed hints it will lower interest rates in March, and that is already reflected in prices
Bitcoin prices likewise rest on people's belief in speculative demand

The US Fed sets the base interest rate approximately once every six weeks, eight times a year. The market has no reason to follow the very next day when the Fed sets a rate; in fact, changing the base or target rate cannot move the market overnight. Rather, it is a method of controlling the amount of money supplied to commercial banks, and it is common for market rates to adjust substantially within one to two weeks as the Fed deploys tools such as controlling bond sales volume.

The system in which most central banks set base rates this way and the market moves accordingly has been in place since the early 1980s. The only difference from before is that the money supply was the target then, whereas the interest rate is the target now. As experience with market intervention accumulates, the central bank learns how to deal with the market, and the market in turn adapts to the central bank's control. This accumulated familiarity with reading each other's signals goes back at least 40 years, and in some respects as far as the Great Depression of 1929.

Yet although the Federal Reserve has declared that it is not yet time to lower rates and that it will wait until next year, commercial bank rates are falling day after day. Glancing at US rate movements in the Financial Times, I saw long-term bond yields dropping daily.

Why is the market interest rate falling while the Federal Reserve remains silent?

Realization of expectations

Suppose the interest rate will fall 1% a month from now. Unless you need a loan tomorrow, you will wait a month before going to the bank. Actually, these days you can send documents through apps and contactless loans are common, so for a month you simply will not open the loan menu in your banking app.

From the perspective of a bank that needs to issue many loans to secure profitability, if such customers multiply, it will have to sit on its hands for a month. And what if rumor says rates will fall further in two months? It may have to sit on its hands for two.

Now put yourself in the position of a bank branch manager. Everyone expects the central bank to cut rates in a month, and everyone knows that everyone knows it, so the adjustment will not be a hasty post-announcement reaction; nobody is even waiting for the announcement date. If advance reflection is certain, the market rate should adjust well before the month is up. Having worked your way up to branch manager, you know how the industry moves, so you can expect a call from head office in two weeks telling you to lower rates and push loans and deposits. But a loan is issued the same day the documents arrive only when the president's closest aide shows up making noise; review usually takes more than a week, and often two weeks or a month.

Now, as a branch manager with 20+ years of banking experience who knows all this, what do you do if a central bank cut in one month is near certain? You need a track record of loans to aim beyond branch manager, right? You have to beat the other branches, right?

Probably a month early, you issue an (unofficial) instruction to branch staff to tell customers that loan screening will assume the lower rate. Over lunch with the wealthy locals, you mention that your branch will lend at lower rates, and you introduce promising commercial properties to good clients: buy before someone else does.

When everyone shares the same expectation, it is reflected immediately

When I was studying for my doctorate in Boston, there was so much snow in early January that all classes were cancelled. When school reopened late in February, a professor emailed us in advance to clear our schedules, saying class would be held every day from Monday to Friday.

I walked into class on the first day (Monday); as classmates joked that we would see each other every day that week, the professor came in and told us:

I'm planning to give a surprise quiz this week.

We figured the eccentric professor was teasing us with something strange again. Then he asked us when the surprise quiz would be held. For a moment, my mind raced: when can the exam be? (The answer is in the last line of the explanation below.)


If there is no surprise quiz by Thursday, Friday becomes the day of the quiz. But then it is no longer a surprise, so Friday cannot be the day of the surprise quiz.

What if there is no surprise quiz by Wednesday? Since Friday is ruled out, the only remaining day is Thursday. But if Friday is excluded and only Thursday remains, Thursday is no surprise either. So not Thursday.

And if there is no surprise quiz by Tuesday? As you can probably guess by now, Friday, Thursday, Wednesday, and Tuesday all fail the surprise condition by this logic. What about the remaining day?

It was Monday, right then, as the professor spoke.


With that, he told us to take out a piece of paper, write our names and an answer logically explaining when the surprise quiz would be, and submit it. I had no idea at first, but then I suddenly realized the answer I was submitting was itself the surprise quiz, so I wrote the reasoning above and handed it in.
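The elimination argument can be sketched as a toy backward-induction loop (an illustration of the reasoning, not a resolution of the underlying paradox):

```python
def eliminate_surprise_days(days):
    """Backward induction: the last day still possible is foreseeable,
    hence never a surprise, so strike it; repeat until nothing is left."""
    remaining = list(days)
    struck = []
    while remaining:
        struck.append(remaining.pop())  # last possible day is no surprise
    return struck

order = eliminate_surprise_days(["Mon", "Tue", "Wed", "Thu", "Fri"])
print(order)  # strikes Friday first, then Thursday, down to Monday
```

Every day gets eliminated, yet the quiz handed out on Monday still surprised the class; that tension is exactly what makes the surprise-exam paradox a paradox.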

The example above explains why a company's stock price jumps right now if people predict it will rise fivefold in one month. In reality, the stock market prices a company on its profitability over the next two or three quarters, not on today's. If the company is expected to grow explosively in the second or third quarter, that is reflected today or tomorrow. The reason it drags into tomorrow at all is regulation, such as daily price limits, and the time it takes information to spread. Just as some students can submit the answer immediately while others need a friend's explanation after the test, the more advanced the information, the slower it may diffuse.
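In the simplest no-arbitrage sketch, today's price is just the discounted expected future price, so news moves the price now, not on the future date. All numbers below are invented for illustration.

```python
# Minimal sketch: shared expectations are priced in immediately.
def price_today(expected_price_in_a_month, monthly_discount_rate=0.004):
    # today's price = discounted expectation of next month's price
    return expected_price_in_a_month / (1 + monthly_discount_rate)

before_news = price_today(100.0)  # market expects the price to stay at 100
after_news = price_today(500.0)   # news: fivefold rise expected in a month
print(before_news, after_news)    # the jump happens today, not in a month
```

Frictions like daily price limits and slow information diffusion are exactly what this sketch leaves out; they are why the adjustment sometimes takes a day or two rather than an instant.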

Everyone knows this, so why does the Fed say no?

As late as October and November, at least some people disagreed with the claim that a rate cut would come by March. With confidence growing in December that the US is entering a recession, there is now talk of a cut at the January 31 meeting rather than in March; Wall Street professionals now price that possibility at close to 10%, up from 0% just a month ago. Meanwhile, Chairman Powell keeps deflecting, saying he cannot yet definitively promise a cut. We all know that even if January is uncertain, March is near certain. He has far more information than we do, and countless economics PhDs beneath him researching and filing reports. So why does he play ignorant?

Let's look at another example similar to the surprise quiz above.

A professor enters the first class of the semester and announces that the grade will be determined by a single final exam, which he plans to make extremely difficult. Students hoping for easy credit flee during the add/drop period. The remaining students grumble but persevere, listen carefully, and later form study groups because the content is so hard. Now imagine it is right before the final, and the professor knows how hard they studied.

The professor's original goal was to make students study, not to harass them with difficult questions. Writing an exam is a hassle, and grading is worse. If he trusts that the remaining students studied well, the free riders having been driven out, he might as well just give everyone an A.

When the students enter the exam room, wouldn't a note like this be waiting?

No exam. You all have A's. Merry Christmas and Happy New Year!

From the students' perspective, they may feel mocked and helpless. From the professor's perspective, however, this was his best choice.

  • The free riders were driven out.
  • The remaining students studied hard.
  • No exam questions to write.
  • No grading to do.
  • Grade entry is just a column of A's.
  • No more students complaining about grading.

The example above is called 'time inconsistency' in game theory and is a standard illustration of how the optimal choice changes over time. Of course, if the professor repeats the same strategy, free riders will flock to register next semester. So the following semester he must actually hold the exam and become an 'F bomber', handing out failing grades in bulk. At a minimum, the time-inconsistent move must come at unpredictable intervals for the strategy to keep working.
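The professor's dilemma can be written down as a toy payoff comparison; all numbers are assumptions made up for illustration, not part of any formal model in this article.

```python
# Toy payoffs for the professor (assumed numbers, illustration only).
grading_cost = 3            # effort of writing and grading the exam
value_of_studied_class = 8  # payoff when students actually studied
value_of_lazy_class = 2     # payoff when nobody bothered to study

# Ex post, once the studying has already happened:
hold_exam = value_of_studied_class - grading_cost  # announced plan
cancel_exam = value_of_studied_class               # deviation: skip grading

# Ex ante, if students anticipate cancellation and never study:
anticipated_cancel = value_of_lazy_class

print(hold_exam, cancel_exam, anticipated_cancel)
```

Cancelling beats holding the exam once the studying is done, but a professor known to always cancel ends up with the worst payoff of all, which is why the deviation must stay unpredictable.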

The same logic applies to Chairman Powell. Even if a cut is due in January or March, staying silent until the end signals a will to prevent the economy from overheating, as if rates might even stay high. Then a sudden cut can still head off a recession.

Macroeconomists summarize this with the terms 'discretion' and 'rules'. Discretion is policy that responds to market conditions; rules are a decision structure that ignores conditions and follows preset benchmarks. A structure that publicly promotes rules while quietly exercising discretion has functioned as the market's operating rule for the past 40 years.

Because of this accumulated experience, the central banker sometimes sticks to the rules to the very end, mounting a defense so the market does not come to expect discretion, and sometimes moves faster than the market expects. All of these are choices designed to show, via time inconsistency or its reverse, that market expectations will not be accommodated unconditionally.

Examples

Cases like the surprise quiz and the no-exam final are easy to find around us.

Products like Bitcoin are little more than 'digital pieces' with no intrinsic value, yet some hold a firm belief that it will become a new currency replacing central-bank money, while others, unsure about the currency story, buy simply because the price rises. Prices swing on the trades of the (overwhelming) majority of like-minded investors. The logic of the surprise quiz hides in buying because it looks set to rise; the central banker's no-exam strategy hides in insisting on the value to the very end while knowing, deep down, that it is not really there.

The same goes for the 'mabari', the securities brokers who pump theme stocks by stirring up wind, and for the sales pitch of academies claiming you can become an AI expert with a nine-figure salary just by learning to code. They all cleverly exploit information asymmetry, packaging tomorrow's uncertain value as something great in order to sell today's product at an inflated price.

Even short of outright fraud, advance reflection of value is common around us. If a Gangnam apartment looks set to rise, it rises overnight; if it looks set to fall, it moves by hundreds of millions of won in a single morning. The market does not wait; it reflects changed information immediately.

Of course, pre-reflected information is not always correct. You will often hear the term 'overshooting', which refers to the market overreacting: stock prices rising too far, or real estate prices falling too far. Among many reasons, it happens because people who merely echo what others say do not accurately price the information. In stock markets, a large one- or two-day rise is typically followed by a slight fall the next day, a clear example of overshooting.

Can you guess when the interest rate will drop?

Whenever I bring up this topic, someone who was dozing off wakes up at the end and asks, 'Just tell me when the interest rate will go down.' He can't follow the complicated logic; he only wants the date.

If you have followed the story above, you will predict that market rate adjustments continue between the Christmas and New Year holidays, before the central bank moves. Whether the decision comes on January 31 or March 20 next year is unclear, because it is up to them. Economic indicators are just numbers; in the end, rates move only when people stake their future reputations on a decision, and I cannot see inside their heads.

But since they too have the rest of their lives, they will try to decide rationally. Those smart enough to solve the surprise quiz on the spot adjust their expectations fastest and become market leaders; those who hear the solution secondhand miss the opportunity through the information lag; and those who ask 'just tell me when' react only after the whole event is over. While you are emailing around asking who is right, you will learn that the market correction is already done. To put it plainly: it is already coming down. The 30-year bond yield, close to 5.0% a month ago, has fallen to 4.0%.


The process of turning web novels into webtoons and data science


Web novel to webtoon conversion is not based only on 'profitability'
If the novel's author has money or bargaining power, 'webtoonization' may be nothing more than a marketing tool for the web novel
Data science models built on market variables alone cannot capture such cases

A student in SIAI's MBA AI/BigData program, struggling with her thesis, chose as her topic the conditions for turning a web novel into a webtoon. People generally assume that if a web novel's views are high and its sales are large, a follow-on contract with a webtoon studio comes easily. She brought a few reference data science papers, but they looked only at publicly available information. What if the conversion was the web novel author's own choice? What if the author simply wanted to spend more marketing budget by adding a webtoon to his line-up?

The literature mostly runs hierarchical structures labeled 'deep learning' or uses 'SVM', tasks that simply lean on computer calculation, grinding through every case a Python library provides. Sorry to put it this way, but such calculations are little more than a waste of computer resources, and it must be pointed out that the crude reports of such researchers are still registered as academic papers.


Put all the crawled data into 'AI', and it will wave a magic wand?

Converting a web novel into a webtoon is like turning a text-only storybook into an illustrated one. Professor Daeyoung Lee, Dean of the Graduate School of Arts at Chung-Ang University, describes the further move to OTT as turning it into a video storybook.

This transition is hard because the conversion cost is high. Domestic webtoon studios run design teams of anywhere from 5 to dozens of designers, and the market has differentiated to the point that even a small character image or pattern that looks trivial to us must be licensed and paid for. After all the labor costs and the purchases of characters and patterns, turning a web novel into a webtoon still takes serious money.

It is typical 'business expert' thinking to assume that manpower and funds will concentrate on the web novels most likely to succeed as webtoons, as investment money flows in and new commercialization is demanded.

But the market does not run on the logic of capital alone, and 'plans' built on that logic often go wrong by misreading the market. In other words, even if you collect platform data such as views, comments, and purchases and build a model of the probability of webtoonization and the webtoon's success, the model is unlikely to be right.

One thing to note here: while many errors stem from market uncertainty, a significant share stems from model inaccuracy.

Wrong data, wrong model

Those who simply trust that 'deep learning' or 'artificial intelligence' will take care of it understand a wrong model as having picked a less suitable algorithm when another 'deep learning' algorithm fits better, or worse, as having used a lesser AI when a better AI was available.

But which 'deep learning' or 'artificial intelligence' fits well is a lower-priority question. What really matters is how accurately you capture the market structure hidden in the data, so you must verify that the model fits not just by chance in today's sample but consistently in future samples. Unfortunately, we have long seen that most 'artificial intelligence' papers published in Korea intentionally select and compare data from well-matched periods; professors' research capability is judged by a simple count of K-SCI papers; and the Ministry of Education's crude rules about which frequently-appearing journals count as good ones ensure that proper verification is never carried out.

The calculation known as 'deep learning' is simply a graph model that finds nonlinear patterns in a more computation-heavy way. For natural language, which must follow grammar, or computer games, which must follow rules, the error probability in the data itself is close to 0%, so there is no major problem in using it. The webtoonization process above, however, is full of behavior the market data does not capture, and the actual decision process is likely quite different from what an outsider sees.

Simply put, the barriers facing writers with a successful track record are completely different from those facing new writers. Kang Full, who recently scored a huge hit with 'Moving', explained in an interview that he held the webtoon's intellectual property from the start and made the major decisions in the OTT transition himself. Ordinary web novel and webtoon writers cannot even imagine this, because most platforms contract to retain the intellectual property rights for derivative works when content is sold through them.

How often can an author decide, at his or her own will, whether a webtoon or OTT adaptation happens? And if that share grows, what will the 'deep learning' model above conclude?

The public's mental model does not include cases where webtoonization or an OTT deal proceeds at the author's will. The 'artificial intelligence' models above can only explain the share governed by the 'logic of capital' inside the web novel and webtoon platforms. The moment the share driven by 'author's will' rather than 'logic of capital' grows, the model will estimate the variables we expected to matter as much weaker, and conversely make unexpected variables look stronger. In reality, we simply failed to include an important variable, 'author's will', that belonged in the model; having never even considered it, we end up with an absurd story under the absurd title 'the webtoonization process as told by artificial intelligence'.
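The distortion is easy to simulate. The data-generating process below is entirely invented for illustration (no real webtoon data): conversion depends on both an observed market signal ('views') and an unobserved 'author's will' that here runs against raw popularity, since a writer with bargaining power may push a webtoon regardless of view counts. A model that sees only the market variable then estimates its effect as roughly zero, even though the true effect is 1.0.

```python
import numpy as np

# Invented DGP, illustration only: not estimated from any real data.
rng = np.random.default_rng(42)
n = 10_000
views = rng.normal(size=n)  # observed market signal

# Unobserved driver, negatively related to views in this toy setup.
author_will = -0.5 * views + rng.normal(size=n)

# True conversion score depends on BOTH drivers.
score = 1.0 * views + 2.0 * author_will + rng.normal(scale=0.5, size=n)

# A model that only sees the market variable:
beta_views = np.polyfit(views, score, 1)[0]
print(beta_views)  # near 0, not the true 1.0: views look irrelevant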

Before data collection, understand the market first

It has now been two months since the student brought that model. For those two months, I have been asking her to properly understand the market so she can find the missing pieces in the webtoonization process.

In my business experience, I have seen companies that thought they had enough data for an interesting challenge but could not proceed for lack of the 'Chairman's will'. Conversely, I have seen countless companies, completely unprepared and without the necessary manpower, push absurd projects 'because that's what the Chairman said', hiring only IT developers without data science experts and endlessly copying open libraries from overseas markets.

Given the capital and market conditions that webtoonization requires, a significant number of webtoon deals are likely 'bundled' into new-work contracts as a natural sweetener to attract already successful web novel writers. Writers who want leverage over the studio may contract with a webtoon studio themselves and start serializing the webtoon after the first 100 or 300 episodes of the novel are released. A writer who has already seen profits rise from a webtoon's extra promotion of the novel may treat the webtoon as one more promotional device for selling the intellectual property (IP) at a higher price.

To the public, this 'author's will' may look like an exception, but once such cases exceed even 30% of web novel-to-webtoon conversions, data collected under the conventional mental model can no longer explain webtoonization. With accuracy already hard to achieve because of various market factors, how can conventionally collected data yield a meaningful explanation when more than 30% is driven by variables like 'author's will' rather than market logic?

Data science is not about learning ‘deep learning’ but about building an appropriate model

In the end, it comes back to the point I always give students: understand reality, and find a model that fits that reality. In plain English, you must find a model that fits the Data Generating Process (DGP), and the webtoonization model above does not consider the DGP at all. Were scholars sitting through such a presentation, complaints like 'who on earth selected the presenters?' would surface, and many would simply walk out even at the risk of seeming rude, because such a presentation is itself disrespectful to the audience.

To build a DGP-aware model in this setting, you need deep background knowledge of the web novel and webtoon markets. A model that ignores how writers on major platforms communicate with platform managers, what the market relationship between writers and platforms looks like, and how far and in what ways the government intervenes, and that simply pours scraped data into textbook 'artificial intelligence' models, is pointless. If an understanding of the market could be extracted from that data, it would be attractive data work; but, as I keep saying, unless the data is natural language following grammar or a game following rules, the exercise is merely a meaningless waste of computer resources.

I don't know whether the student will return to next month's meeting with market research that destroys my counterargument, restructure the model's details around an understanding of the market, or, worse, change her topic. What is certain is that a 'paper' that merely feeds collected data into a coding library, with 'data' in its name, ends up as nothing more than jumbled code wrapping one's own delusions: a novel filled with text only.
