2023 Kaggle AI Report
AI Report 2023 3
TABLE OF CONTENTS
ESSAYS [continued]
04
Tabular / Time Series Data
● Section Overview, Bojan Tunguz
● "Learnings from the Typical Tabular Modelling Pipeline", Rhys Cook
● "AI Report: Time Series and Tabular Data", Chuandong Tang, Paulina Skorupska
● "Tabular Data in the Age of AI", Kobbie Manrique
05
Kaggle Competitions
06
AI Ethics
● Section Overview, Parul Pandey
● "Exploring the landscape of AI Ethics", Patrik Joslin Kenfack, Meghana Bhange, Maryam Babaei, Ivaxi Sheth, Dave Harold Mbiazi Njanda
● "Developments in AI and Ethics in the past 2 years", Antong C.
● "Ethical AI is all we need!!", Shreya Mishra, Piyush Mathur, Raghav Awasthi, Anya Mathur, Harshit Mishra
07
Other Topics
Introduction
Foreword
D. Sculley
The world of AI has seen breathtaking progress over recent years, with rapid advances in
the capabilities of models as large as ChatGPT, Llama, and PaLM, and as small as those
that can fit on device or in a web browser. The advances of AI have not been confined
only to models: we have also seen an incredible spread of knowledge and expertise
across the globe, with AI experts participating in the field from every corner of the world
and every walk of life.
At Kaggle, we believe that this global community of AI and ML experts – now 15 million
members strong – is one of the most valuable open resources in the world today. Our
community works together to learn, share, compete, collaborate, stress test, and evaluate
what really works in AI and ML, and does so in a deeply rigorous fashion.
It is a great pleasure to welcome you to the 2023 Kaggle AI Report, created by our
community and selected from hundreds of submissions. Each paper within this report
gives a unique viewpoint on the most interesting or most important recent developments
in the field of AI and ML.
Enjoy the reading, and as always, our deepest thanks to the millions of members of the
Kaggle community!
About the Report
Phil Culliton
Report we feel represent significant areas within the research and practice of modern ML.
The submissions were evaluated and edited by a member of our community with
noteworthy expertise in their section’s area. Each expert selected the winner in
their section, as well as a number of honorable mentions.
About the Report (Cont.), Phil Culliton
Sections
Sections include:
● Generative AI: A frontier of machine learning that is coming into new focus in the last few years, this area has combined the best of text and image research to create a fundamental shift in the usability and utility of machine learning models.
● Text data: Natural language processing and statistical language modeling are the backbone of some of the most exciting recent advances in AI.
● Image / video data: Image data was the foundation for early advances in deep learning and the past decade is a testament to the ingenuity of researchers and practitioners in this area, with ever-expanding datasets and problem formulations met with fresh, new ideas.
● Tabular / time series data: Tabular data problems are the most common type of problem to solve, and an area that has led to incredible research and development – some by our very own community members.
● Kaggle competitions: Kaggle is perhaps most famous for its competitions, curated and led with care by our team of experts, in partnership with hosts including researchers, educators, and industry giants. Competitions present cutting-edge problems and challenges that our diverse community solves in myriad exciting ways.
● AI ethics: Humans in the world of machine learning have always struggled to make our discoveries fair, unbiased, and equitable for other people.
● Other: The above six sections are extensive, but ultimately machine learning and artificial intelligence are oceans of possibility. We include a section for some fascinating essays that don’t quite fit into any of the others.
The essays in our report were written as notebooks, a rich, multimedia communication form that can include text,
images, video, and even runnable code. Please be sure to click on the link for each essay to fully explore the
experience that the authors created.
Area Chairs
We created this report with the understanding that our community holds a sum of ability and knowledge far
greater than our small team of editors. We reached out to prominent members of our community with
backgrounds and proven skill in each of the report’s 7 topic areas to act as judges and expert editors. These
members of the community are our Area Chairs.
Sanyam Bhutani drinks chai and makes content for the community at H2O.ai. When not drinking chai, you can find him hiking the Himalayas often with LLM papers. He is best known for chai, GPUs and mountains. He is the host of the most popular Kaggle podcast with top Kaggle Grandmaster interviews on Chai Time Data Science. On the internet, he's known for learning in public and "maximizing compute per cubic inch of ATX".
Christof is a Deep Learning Researcher at NVIDIA. He is particularly interested in novel deep learning architectures with respect to graphs, computer vision and audio. His background is in mathematics and he completed his PhD at the Ludwig-Maximilians-University in Munich about stochastic processes and financial markets. He started Kaggling six years ago and after participating in more than 70 competitions currently holds first rank in the world-wide competition ranking.
Martin has a Ph.D. in astrophysics and an academic background in researching exploding stars in nearby galaxies. He got into the fields of data science and machine learning through Kaggle, where he became the first ever Kaggle Notebooks Grandmaster and for a time held the number one spot in the notebooks rankings. He has spoken at various conferences about his passion for effective data communication and storytelling, and he curated 100 episodes of the Hidden Gems series on underrated Kaggle Notebooks (example post and dataset). Martin is the Lead Data Scientist for the market research company YipitData.
Karnika Kapoor is a data scientist with a background in mechanical engineering. She is skilled in machine learning and AI, and is currently employed as a senior data scientist where she focuses on growth optimization and logistics. She is actively engaged with the Kaggle community and became a Kaggle Notebooks Grandmaster in part due to her ability to transform data into well-communicated insights. With a passion for learning, Karnika stays AI-current, offering a unique blend of technical prowess and data-driven acumen.
Rob Mulla has over 10 years of experience working with data, using his skills at companies in sectors including pharmaceuticals, hospitality, energy, and sports sciences. His journey with data has led him to become a 4x Kaggle Grandmaster, where he has had the chance to participate alongside some of the best in the field. Outside of work, Rob enjoys sharing his knowledge of Python and data science on his YouTube channel, @robmulla, which has garnered a community of over 100,000 subscribers. Rob is an alumnus of Virginia Tech with a BS, Kansas State University with an MSE, and UC Berkeley where he earned his Masters in information and data science.
Parul Pandey has a background in electrical engineering and currently works as a Principal Data Scientist at H2O.ai. She is also a Kaggle Grandmaster in the notebooks category, where she composes compelling stories through the medium of notebooks. Her strength lies in analyzing data and eliciting useful insights from it with the help of powerful visuals. Parul is one of the co-authors of the Machine Learning for High-Risk Applications textbook, which focuses on the responsible implementation of AI. Parul has written multiple articles focused on data science and she mentors, speaks, and delivers workshops on topics related to responsible AI.
Essays
01
Generative AI
Generative AI Essay #1
Spanning 2021 to 2023, this essay charts the profound evolution of generative AI, spotlighting strides in image synthesis, language models, and audio generation. Noteworthy innovations like GPT-4, DALL-E, and ChatGPT take center stage, propelling AI-generated content into new realms. This narrative journey navigates pivotal years: 2021 witnesses DALL-E's text-to-image feats and GitHub Copilot's code suggestions, 2022 showcases Meta's contributions and ChatGPT's emergence, and 2023 highlights the rapid rise of multimodal AI with GPT-4's unveiling.
The essay carefully addresses ethical concerns tied to AI's capabilities. Looking forward, it acknowledges ongoing research, tools like LangChain and AutoGPT, and the promise of innovative, ethically conscious AI integration.
In summary, this essay captures the dynamic trajectory of generative AI's evolution, showcasing significant achievements, addressing challenges, and envisioning a future where AI's potential is harnessed responsibly.
Link to Notebook
Generative AI Essay #2
This essay traces the evolution and growing influence of generative AI across many domains. With an adept blend of historical insight and technological exploration, the essay delves into the societal significance and intricate mechanisms of generative AI.
From its historical roots to the accelerated growth fueled by Deep Learning, the essay navigates pivotal models like DALL·E 2, highlighting the synergy between computational resources, dataset scale, and AI potential. Bridging Computer Vision and Natural Language Processing, the exploration showcases transformative structures such as GANs and Vision Transformers, offering a glimpse into the collaborative future of multimodal learning.
The essay candidly confronts challenges and ethical considerations, casting a balanced light on the practical applications and limitations of generative AI. Ultimately, the essay comprehensively celebrates generative AI's present impact while illuminating its promising trajectory into the future.
Link to Notebook
Generative AI Essay #3
This essay explores the innovative journey and paradigm shift that generative AI has inspired. From its resemblance to an enchanted data-powered box that generates patterns to its evolution from Boltzmann Machines to Generative Adversarial Networks (GANs), generative AI's impact on image synthesis, text generation, and more is discussed in detail. Recent achievements, including the fusion of transformers and GANs, showcase its collaborative potential in enhancing human creativity. Amid these strides, ethical considerations emerge, emphasizing impartial training data and addressing the societal implications of AI-generated content. The essay navigates the challenges, particularly deepfakes, while inviting us to witness a revolution that defines a new balance between human ingenuity and generative AI capabilities.
Link to Notebook
02
Text Data
When the entire internet is used as a training corpus for AI models, knowledge transfer using deep learning becomes an incredibly powerful technique – and the major advancements of the past few years are a testament to that.
Early approaches to small datasets often included
Similar to the benefits of scaling neural network architectures, transformer-based models can be trained on trillions of tokens and the resulting pre-trained model weights can either be used directly or fine-tuned to be state-of-the-art for a large and diverse number of tasks. Large language models (LLMs) are unsurprisingly the primary focus of attention for many or most AI researchers today.
Overview of Essays
The top essays in this section provide a diverse and high-quality overview of contemporary LLMs. They notably skip over foundational NLP work that preceded LLMs, and similarly give light treatment to the building blocks that enable their breakthrough performance on benchmarks and interactive use cases. Contributions focused primarily on the scientific merits of models, largely sidelining discussion of practicalities like hardware and the productionization of models. The focus on LLMs is not surprising, given they are currently at the center of society's attention when it comes to the applicability of artificial intelligence to solving human problems.
Interestingly, the top essays cover different aspects of LLMs. The winning essay discusses the emergence of contemporary large language models as well as the most recent methods and techniques that make them possible. The other essays discuss fine-tuning and the applicability of LLMs to small datasets, and the ability of LLMs to perform reasoning. As such, the top essays cover diverse and interesting LLM topics.
Text Data Essay #1
Link to Notebook
Text Data Essay #2
This essay focuses on insights gained over the past two years of working and experimenting with LLM reasoning. After giving an overview of different types of reasoning, which are important to understand in tackling reasoning challenges, the authors explore the advancements made through Chain of Thought prompting, Tree of Thought frameworks, linguistic feedback reinforcement, interleaved reasoning and action, and handling complex mathematical reasoning.
By analyzing related papers and their findings, they provide a comprehensive overview of the architecture and progress of LLM reasoning. They put special emphasis on highlighting the lessons learned and the way forward in ensuring ethically safe and reliable AI systems.
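For readers unfamiliar with Chain of Thought prompting, the sketch below shows one common way such a prompt might be assembled as plain text: worked examples that spell out their reasoning, followed by the new question. It is only an illustration; the function name `build_cot_prompt`, the example data, and the exact prompt wording are assumptions and are not taken from the essay. No model is called here.

```python
def build_cot_prompt(examples, question):
    """Assemble a few-shot Chain of Thought prompt as plain text.

    Each example shows its intermediate reasoning before the final answer,
    which is the core idea of Chain of Thought prompting.
    """
    parts = []
    for ex in examples:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['reasoning']} The answer is {ex['answer']}."
        )
    # End with the new question and a cue that invites step-by-step reasoning.
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)


# Hypothetical worked example, for illustration only.
examples = [
    {
        "question": "A pack has 12 pencils. If 5 are given away, how many remain?",
        "reasoning": "The pack starts with 12 pencils. Giving away 5 leaves 12 - 5 = 7.",
        "answer": "7",
    }
]

prompt = build_cot_prompt(
    examples, "A shelf holds 3 rows of 8 books. How many books are on the shelf?"
)
print(prompt)
```

The resulting string would then be sent to whichever language model is being evaluated; that step is omitted because it depends on the model and API in use.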
Link to Notebook
Text Data Essay #3
With giant LLMs becoming expensive to train and prohibitively large to finetune for individuals or small companies, small language models are flourishing and becoming more and more competent. The authors call them "mini-giants" and argue a win-win for the open source community by focusing on small language models.
The essay presents a comparative study of small language models, and a brief discussion of evaluation methods. The authors discuss the application scenarios where small language models are most needed in the real world, and conclude with discussion and outlook.
Link to Notebook
03
Image / Video Data
Topic Summary
This section explores some of the latest advancements in computer vision, particularly as it relates to the use of image and video data. While the field of computer vision dates back to the 1960s, its evolution has been especially exciting over the last few decades. Specifically, over the past two years, there have been significant advancements not only in traditional computer vision tasks like classification and object detection, but also in emerging areas such as Vision Transformers (ViT) and few-shot learning.
In addition to discussing model architectures, this section covers standard practices used in computer vision today like preprocessing and augmentation techniques. Additionally, this section explores the practical applications of these technologies across industries like healthcare (for medical imaging), agriculture (for crop monitoring), and the automotive sector (for self-driving cars). It also covers some of the limitations that computer vision still faces.
Trends & Predictions
Computer vision can be traced back to the 1950s and 1960s, when researchers started the development of algorithms for detecting edges and patterns in images. While improvements in these areas continue, new challenges like object detection, self-supervised learning, and knowledge reasoning are being addressed with innovative model architectures and training techniques. Particularly, video data is witnessing developments in multi-object tracking, action recognition, and spatiotemporal reasoning. A noticeable area of active research in the field revolves around generalization and transformer-based architectures.
The popularity of models like Segment Anything Model (SAM) and YOLO (You Only Look Once) showcases how generalized, open source models can be leveraged and adapted to solve specific tasks. Looking ahead, I anticipate that within the next five years, tasks such as segmentation, object detection, and image classification will continue to be addressed by generalized large-scale models that can be fine-tuned for specific challenges, mirroring what we’ve seen occur with large language models. We will also see the intersection of computer vision and generative AI continue to progress in areas like augmented reality, deep fakes, and AI-enhanced photo and video editing. These advancements inevitably bring with them ethical and philosophical considerations that require our attention.
It’s also important to note some of the limitations that research has yet to overcome. These include the limitations of multi-modal models that incorporate image and video, as well as vision models’ ability to perform in uncontrolled environments, like self-driving cars on new roads.
Overview of Essays
The essays chosen for this section cover the significant research advancements in image and video understanding in recent years. They reference breakthrough papers that have reshaped our understanding of machine learning and the types of tasks that computer vision is able to solve.
Lastly, Dron Bespilotnik's essay “Image and Video Data | Kaggle AI Report” covers some of the practical areas of working with image and video data, including preprocessing techniques, and discusses how data augmentation enhances the training performance of vision models. He describes the role of computer vision in tasks like image classification, segmentation, and question answering.
Image / Video Data Essay #1
Dron’s report highlights the growing trend in image and video data usage. Specifically, he points to the increase in papers published in the field of computer vision and how the number of papers released has continued to increase each year following the groundbreaking publication of AlexNet in 2012. The report then covers 5 areas of computer vision: data preprocessing, computer vision applications, an analysis of CVPR conference submissions, CV use cases, and future perspectives for computer vision.
Data preprocessing and augmentation are discussed as essential components of computer vision pipelines. The use of a combination of real and synthetic data sources has been shown to enhance image and video processing in competitions and research.
Next, the evolution of image classification and segmentation models is covered, highlighting the rise of architectures like Vision Transformers and ConvNeXt, as well as their use with natural language processing for visual comprehension. An exploration of top papers from 2021 and 2022 at the Conference on Computer Vision and Pattern Recognition (CVPR) further shows that transformers and ViT are some of the hottest topics in the field.
In the final sections, the report discusses real-life computer vision applications and their foreseeable challenges.
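As a rough illustration of the augmentation step that the report treats as an essential part of computer vision pipelines, the sketch below applies a random horizontal flip and a random crop to a tiny image stored as a nested list. The function names, the crop size, and the flip probability are assumptions chosen for this example; the essay itself may use different transforms and a library such as an image-processing package rather than plain Python.

```python
import random


def horizontal_flip(image):
    """Mirror each row of a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position inside the image."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def augment(image, rng, flip_prob=0.5):
    """Randomly flip, then crop one pixel off each dimension (assumed policy)."""
    out = horizontal_flip(image) if rng.random() < flip_prob else image
    return random_crop(out, len(image) - 1, len(image[0]) - 1, rng)


# Hypothetical 3x3 grayscale "image" used only for demonstration.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rng = random.Random(0)  # fixed seed so repeated runs give the same variants
for _ in range(3):
    print(augment(image, rng))
```

Each call produces a slightly different view of the same image, which is how augmentation enlarges the effective training set without collecting new data.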
Link to Notebook
04
Tabular / Time Series Data
Topic Summary
Tabular data, in the form of transactional data and records of exchange and trade, has existed since the dawn of writing. It may even precede written language. In most organizations, it is the most commonly used form of data. There is no definitive measure, but it is estimated that between 50% and 90% of practicing data scientists use tabular data as their primary type of data in their professional setting.
Time series data is, in many respects, similar to tabular data. It is often used to encode the same kinds of transactions as non-temporal tabular data, with one important distinction: inclusion of temporal information. The temporal nature of those data points becomes a major underlying feature of time-series datasets, requiring special considerations in analysis and modeling.
Tabular data, and to a much lesser extent time-series data, has proven largely impervious to the deep learning revolution. Non-neural-network-based ML techniques and tools are still widely used and have stood the test of time. Nonetheless, there have been some interesting recent developments on that front as well. This remains a kind of data where a wide variety of tools and techniques are relevant, and there exists tremendous potential for further research and improvement.
Trends & Predictions
There are three main trends with machine learning for tabular data:
1. Need for unique approaches for every dataset/problem.
2. Outsized importance of data munging and feature engineering.
3. Continuing dominance of gradient boosted trees as the algorithm of choice.
The first two trends have been well known in the Kaggle circles ever since the platform launched, and the last since XGBoost was first introduced on Kaggle in 2014 (you can find a collection of winning Kaggle solutions that use XGBoost here). Even though there have been comparatively few featured tabular data competitions on Kaggle over the past couple of years, the ones that were held there reinforced these trends.
The dominance of gradient boosted trees has, until recently, not been substantially covered in the ML research literature, but over the past few years there have been more attempts to understand this phenomenon and evaluate alternative approaches. In particular, there have been many more attempts to use neural networks with tabular data, but those attempts – even when successful – come at the added expense of computational complexity. Most of that research has had minimal effect on applied ML modeling for tabular data.
It is very likely that, in the upcoming years, we will see even more research on neural networks for tabular data, as well as new innovations for gradient boosted trees. A very promising new area would be the application of generative AI to automated feature engineering and AutoML for tabular data in general.
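To make the term "gradient boosted trees" concrete, here is a minimal sketch of the underlying idea for squared-error regression on a single feature: each round fits a one-split tree (a stump) to the current residuals and adds a damped copy of it to the ensemble. This is a teaching illustration under simplifying assumptions, not how XGBoost or any other library is implemented; all function names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Pick the single threshold that best splits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1], best[2], best[3]


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, preds)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and every stump's damped contribution."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left_value if x <= threshold else right_value)
        for threshold, left_value, right_value in stumps
    )


# Hypothetical toy data: a step function of one feature.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
model = fit_boosted_stumps(xs, ys)
print(round(predict(model, 2), 2), round(predict(model, 7), 2))
```

Production libraries add regularization, second-order gradient information, multi-feature trees, and many engineering optimizations on top of this basic loop.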
Tabular / Time Series Data Essay #1
Link to Notebook
Tabular / Time Series Data Essay #2
Link to Notebook
Tabular / Time Series Data Essay #3
The purpose of this report is to provide an overview of the recent advancements in AI techniques for tabular data. Our goal is to provide valuable insights to the Kaggle data science community and inspire future innovations. By grasping the latest AI techniques and their practical applications in tabular data analysis, data professionals can remain at the forefront of this rapidly evolving field.
Even though the essay doesn’t cover experience with Kaggle competitions and tabular datasets, it provides an interesting and unique overview of some of the most advanced recent research developments in this area. The essay is particularly noteworthy for its overviews of AutoML and explainable AI, both of which are very important for many everyday applications.
Link to Notebook
05
Kaggle Competitions
Topic Summary
Kaggle competitions are the most meritocratic avenue for enthusiasts and veterans alike to establish their AI credentials. The leaderboard is based on an objective measurement to provide a score for each submission and therefore does not lie. It is regarded by many as one of the hardest and truest challenges in data science.
The real value of competitions emerges over time, where one can observe winning solutions from older competitions becoming new baselines. Eventually the “tricks” of winners become standard practice in the next competitions: ideas like pseudo labeling, seed averaging, hill climbing - to name just a few - were “tricks” that went from being explicitly mentioned in winning solutions to now frequently appearing in many solutions.
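Of the "tricks" named above, hill climbing is perhaps the least self-explanatory. In Kaggle usage it usually refers to greedy ensemble selection: start from the best single model, then keep adding whichever model's predictions most improve the averaged validation score, allowing repeats. The sketch below shows that reading under stated assumptions; the function names, the use of mean squared error, and the toy predictions are all hypothetical.

```python
def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def average(pred_lists):
    """Element-wise mean of several prediction lists."""
    return [sum(column) / len(pred_lists) for column in zip(*pred_lists)]


def hill_climb(model_preds, targets, max_steps=20):
    """Greedy ensemble selection with replacement on validation predictions."""
    names = list(model_preds)
    first = min(names, key=lambda n: mse(model_preds[n], targets))
    chosen = [first]
    best_score = mse(model_preds[first], targets)
    for _ in range(max_steps):
        # Score the ensemble that would result from adding each candidate.
        scores = {
            n: mse(average([model_preds[c] for c in chosen + [n]]), targets)
            for n in names
        }
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best_score:
            break  # no candidate improves the blend, so stop climbing
        chosen.append(candidate)
        best_score = scores[candidate]
    return chosen, best_score


# Hypothetical validation targets and out-of-fold predictions.
targets = [1.0, 2.0, 3.0, 4.0]
model_preds = {
    "model_a": [1.2, 2.2, 3.2, 4.2],  # slightly too high
    "model_b": [0.7, 1.7, 2.7, 3.7],  # slightly too low
    "model_c": [2.0, 2.0, 2.0, 2.0],  # ignores the input entirely
}
chosen, score = hill_climb(model_preds, targets)
print(chosen, round(score, 5))
```

Seed averaging can be viewed as a simpler relative of the same idea: the `average` helper is applied to predictions from one model trained with several random seeds, with no selection step.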
Overview of Essays
Kaggle Competitions Essay #1
Kaggle is usually known as an “ensembling playground” to the outside world: Kaggle competitors often combine a variety of methods and models in order to increase their score without needing to balance the computational costs of their solutions. To counter this trend, Kaggle has been awarding special prizes to solutions that are both accurate and performant. This report shares learnings from Kaggle competitions concerning efficient models and efficient modeling practices, in particular.
In this essay the author tells the story of the significance of striking a balance between predictive performance and inference time to diminish the carbon footprint of deep learning models.
The report "Towards Green AI" shines a light on the pivotal challenge of our times: crafting deep learning models that deliver on performance without exacting a heavy carbon toll. Driven by Kaggle's visionary Efficiency Prize, the study delves deep into the heart of techniques like pruning, low-rank factorization, and quantization, scrutinizing their true potential in the real-world AI landscape.
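Of the efficiency techniques named in this summary, quantization is the easiest to show in a few lines. The sketch below assumes a simple symmetric, per-tensor int8 scheme: every weight is divided by one shared scale and rounded to an integer in [-127, 127]. This is an illustrative assumption, not necessarily the scheme studied in the essay, and the function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Clamp after rounding to guard against floating point overshoot.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical layer weights, for illustration only.
weights = [0.52, -1.0, 0.254, 0.0, 0.91]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized)
print(f"max round-trip error: {max_error:.5f}")
```

Storing one byte per weight instead of four reduces memory and can speed up inference, at the cost of a rounding error of at most half the scale per weight in this scheme.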
Link to Notebook
Kaggle Competitions Essay #2
This essay dives deep into the minds of Kaggle winners, and the author uses LLMs to systematically extract and analyze structured data from a myriad of Kaggle competition writeups (dedicated discussion posts where winners of Kaggle competitions describe their solutions).
It distills wisdom and ideas from the most coveted methods and strategies.
From the nuances of data augmentation to the might of gradient boosted decision trees, this report paints a comprehensive picture of what it takes to clinch that coveted top spot. A brilliant blend of data-driven insights and personal experiences, this essay is an essential guide for anyone interested in climbing the leaderboard.
Link to Notebook
Kaggle Competitions Essay #3
Link to Notebook
06
AI Ethics
Trends & Predictions
The study of AI ethics is not merely an academic endeavor but a societal imperative. As AI continues to shape the world, ensuring its ethical deployment becomes paramount to harness its benefits while safeguarding collective values. A growing consensus has emerged that ethics cannot be an afterthought; instead, they must be an integral part of AI system design. However, there’s a palpable need for globally accepted standards on AI ethics. Some authoritative guidance is beginning to emerge, like the ISO’s technical standards for AI and the NIST (National Institute of Standards and Technology) AI Risk Management Framework. This framework highlights key characteristics of trustworthy AI systems, including validity, reliability, safety, security, resiliency, transparency, accountability, explainability, interpretability, bias management, and enhanced privacy. To implement these characteristics, it offers actionable guidance for organizations in four areas: map, measure, manage, and govern.
Several emerging trends in this field warrant attention. One such trend is the continuous auditing of AI systems, especially those in critical sectors. Such systems might undergo ethical audits to ensure they adhere to established guidelines. It’s anticipated that future AI systems will adopt an ‘ethics-by-design’ approach, emphasizing ethical considerations from the outset. For instance, Meta decided to approach the release of LLaMA-2 with a strong focus on responsibility, providing resources and best practices for responsible development of products powered by large language models.
Moreover, as AI becomes more ingrained in daily life, there’s an expected increase in public involvement. People are likely to have a more pronounced role in AI’s ethical considerations, leading to more public forums, consultations, and potential referendums on significant AI deployments. In essence, the landscape of AI adoption and risk management is continuously changing, making it crucial to stay vigilant and proactive when addressing the ethical ramifications of these technologies.
Overview of Essays
The AI Ethics section garnered numerous insightful submissions, highlighting how rapidly this area is evolving. The essays received covered a diverse set of topics, reflecting the depth and breadth of this important discipline.
Some essays covered the big picture of AI ethics, discussing the broader challenges and opportunities it presents. They touch upon how we need to shape AI in a way that is responsible and considers everyone in society, from ensuring fairness in its decisions to thinking about its impact on the environment.
Others focus on more specific areas. For example, some discuss the latest advancements in AI, like the exciting world of generative AI, and the ethical questions these new technologies bring. Others take a closer look at research in the AI world, analyzing the ethical considerations in AI studies and publications.
But despite their different angles, all the essays converge on a shared sentiment: AI holds immense promise for our future, but we need to approach its development and use with care, thought, and a strong sense of responsibility.
AI Ethics Essay #1
This winning essay delves into the critical urgency of AI’s ethical implications in our ever-evolving digital era. The essay talks about central principles, crucial for instilling trust in AI, including privacy, data protection, transparency, explainability, fairness, accountability, safety, robustness, and even environmental considerations. This essay offers a comprehensive exploration of each principle, presenting contemporary efforts and strategies to seamlessly integrate them within AI’s lifecycle.
Notably, the essay offers a dual lens, analyzing each guideline from societal impacts to technical advancements. Each guideline is dissected not only for its societal importance and value, but also for the technical innovations that aid its implementation. In doing so, the essay bridges the perceived gap between ethical principles and actionable technological solutions. Through this comprehensive approach, the essay makes a compelling case for these principles as the foundation for fostering genuine trust amongst all stakeholders throughout the AI system lifecycle.
Link to Notebook
AI Ethics Essay #2
The essay covers the evolution of AI ethics over the past two pivotal years, highlighting enduring challenges and the continuous efforts to address them. During this period, the AI domain has witnessed significant growth and heightened interest in its capabilities. This rapid advancement, however, has accentuated the critical need for robust AI ethics.
Designed as an introductory piece rather than a deep dive, the essay seeks to amplify awareness of AI ethics and champions a future where AI is more inclusive, beneficial, and ethically grounded.
Link to Notebook
AI Ethics Essay #3
In this essay, the authors underscore the essentiality of ethics for building trust in AI systems and fostering sustainable advancement in AI research. The paper sets out to investigate and assess the ethical dimensions embedded in research publications, focusing specifically on AI articles from 2007 to 2023.
examples, the authors emphasize the tangible implications of AI systems, aiming to spotlight real-world ethical challenges and considerations.
Link to Notebook
07
Other Topics
Topic Summary
The rapid growth that the field of Machine Learning has experienced in recent years has led to an explosion of tools, techniques, and applications. In our final category - simply called “Other” - we cover those aspects of the ML landscape that may not fit neatly into any of the previous, more specific categories. This long tail is reflected in the wide variety of topics chosen by the essayists. In turn, the multitude of fascinating subjects showcases the exceptional range and diversity of expertise within the Kaggle community.
With one major exception, the topics of the essays have little in common with one another. Participants analyze such diverse topics as optimization algorithms, graph networks, theoretical physics, robotics, healthcare, the future of work, data-centric AI, mathematical research, autonomous cars, and many more. Together, they offer a panoramic snapshot of the wide-ranging influence of Machine Learning at this pivotal moment in time.
From its humble beginnings in applying computer vision tools to assist with diagnosis in radiology or tomography, the power of ML to save lives and cure diseases is being increasingly leveraged by medical professionals and researchers. Modern applications include, among others, genetic sequencing (with NLP techniques), robotic surgery, medical teaching, drug discovery, and of course the extremely relevant area of vaccine research. This trend is also reflected in a large number of healthcare-related Kaggle competitions in recent years,
Another exciting trend is multi-modal ML applications powered by growing repositories of pretrained models. This is relevant for generative AI applications utilizing agent systems, but also for tailored projects that draw on diverse data sources. In the coming years, models and applications that can derive insights from multi-sensory inputs in a human-like manner will likely play an increasingly important role.
Other Topics Essay #1
Link to Notebook
Other Topics Essay #5
Conclusion
Phil Culliton
In this effort, our community wrote hundreds of essays covering a broad array of topics, and then experts from our community selected the best. The result is a collective perspective on the rapid advancements of AI, shedding light on the most salient topics in modern machine learning.
Many of the essays discussed the recent progress and potential of generative AI to revolutionize multiple industries and massively broaden access, making it the front runner for shaping the future of AI. Ethics and “green” AI will command even more attention as scale, audience, and capabilities expand. Making sense of this future will require validation through competitions, improved benchmarks, and other tools capable of incorporating diverse and representative feedback.
We are excited by the potential of this report to highlight insights and advancements in our field, as told through the collective mindshare of the Kaggle community. As machine learning becomes more accessible, sampling from the diverse opinions of its practitioners is an effective strategy to understand an ever-evolving discipline.
Credits
We extend a tremendous thank you to the people who made this report possible, including:
Karnika Kapoor - Area Chair
Kinjal Parekh - Coordinator
Mark McDonald - Copyeditor
Martin Henze - Area Chair
Parul Pandey - Area Chair
Paul Mooney - First author
Phil Culliton - First author
Raphael Kerley - Design
Rob Mulla - Area Chair
Sanyam Bhutani - Area Chair
Sara Wolley - Coordinator
Siddhita Upare - Illustrations
Will Cukierski - Copyeditor
Citation
Paul Mooney*, Phil Culliton*, Abir Eltaief, Ali Jalali, Antong C., Anya Mathur, Arya Gaikwad, Bojan Tunguz, Christof Henkel, Chuandong Tang, Danial Sultanov, Dariusz Kleczek, Dave Harold Mbiazi Njanda, Diego Flores, Dmitri Kalinin, Harshit Mishra, Hoda Jalali Najafabadi, Julia Elliott, Ivaxi Sheth, Karnika Kapoor, Kobbie Manrique, Leonie Monigatti, Lezhi Li, Lorresprz, Mark McDonald, Martin Henze, Maryam Babaei, Meghana Bhange, Nghi Huynh, Parul Pandey, Patrik Joslin Kenfack, Paulina Skorupska, Piyush Mathur, Pranav Mohan Belhekar, Raghav Awasthi, Rhys Cook, Rob Mulla, Samantha Lycett, Sanyam Bhutani, Shreya Mishra, Svetlana Nosova, Theo Flaus, Trushant Kalyanpur, Will Cukierski, Xinxi Chen, Yassine Motie, Yuqi Liu, Yuxi Li, Zhengping Zhou, D. Sculley.