LLM_JobSkills
This paper presents the development of a tool for analyzing the skills expected in job ads
for software developer positions, with particular attention to the frequency of skills related
to large language models (LLMs). The artifact uses a number of APIs to retrieve job postings
from different sources and then uses an LLM to identify skills within the job descriptions and
their relevance to the use of LLMs. From the data returned by the LLM, the system creates
graphs that visualize trends for skills in job listings. The results of the project show that the
implementation successfully retrieves job postings, identifies keywords and presents the
data visually in graphs. Future work could expand the data analysis portion of the software
to provide a more in-depth overview of how the demand for specific skills changes over time.
Keywords: software development, LLM, AI, Python, market trends, skills, job qualifications
Table of Contents
Table of Tables
Table of Figures
Terminology
1 Introduction
1.1 Problem Statement
1.7 Outline
2 Methodology
2.1 Requirements
3 Implementation
3.1 GUI and Main Class
4 Results
5 Analysis and Discussion
References
Appendix
5.1 Appendix A: functional requirements
Table of Tables
Table # Description
1 The time plan showing each activity performed during the course.
Table of Figures
Figure # Description
2 The process that the program will use to gather and parse data.
Terminology
AI Artificial Intelligence
1 Introduction
1.1 Problem Statement
There are many programming languages and specializations within the field of software
development, and LLMs capable of increasing productivity, for example by acting as code
assistants, have recently emerged [1]. As a job applicant or market researcher, however, it
can be difficult to gain an understanding of what skills hiring companies expect of software
developers, especially skills related to the use of LLMs. There are many tools available for
conducting market research by analyzing job postings, yet none of them analyze the impact
of LLMs in terms of desired skills. This project aimed to create an easy-to-use open-source
tool for retrieving data from job postings and analyzing what skills are currently in demand.
With the introduction of accessible LLMs such as OpenAI's GPT line, Claude, or the more
recent DeepSeek models, the job market has been affected, as reported by Indeed's Hiring
Lab [2]. Their analysis shows a meteoric rise in American job listings for generative AI roles
starting in 2024, and although the last recorded share was only 0.01%, this sudden upward
trend warrants attention and further analysis.
This coincides with new roles such as "Prompt Engineer" and "Generative AI Engineer"
being introduced to the market. Listings for such positions doubled in the second quarter of
2023, making "Generative AI Engineer" the third fastest growing job title on LinkedIn
according to Business Insider [3].
This sudden growth in jobs that use LLMs, and not just jobs focused on building LLMs,
marks the beginning of the incorporation of generative AI into the job market. To keep these
trends under watch, tools and further study are necessary, allowing researchers and
academics to analyze the market and job-seekers to keep up with its changes. It is therefore
important for tools that can analyze and map trends like this on the job market to exist and
grow.
1.3 Aim
This project aims to develop a tool capable of gathering job postings using a number of
APIs, then using an LLM to identify qualifications and whether they are related to large
language models. The program can then be used to display the identified skills graphically.
Figure 2. The process that the program will use to gather and parse data.
1.4 Scope
The project artifact will have a GUI containing a form where users can specify a timeframe,
a location and the type of graph in which the data will be shown. The available graph types
are pie charts, bar diagrams and line graphs. The pie chart will display the percentage of
LLM-related skills out of all skills gathered. The bar graph will show the frequency of each
skill found, highlighting the skills related to LLMs. The line graph will display skill frequency
over time.
The tool will use the NLP model ‘DeepSeek-V3’ to identify requirements from job
descriptions and to mark whether they are related to LLMs or not. Due to the limited time
allotted for this project, the ability for users to pick different models will not be implemented.
Activity
planning
SRS
construction
report
presentation
Table 1. The time plan showing each activity performed during the course.
1.7 Outline
This report is composed of the sections introduction, methodology, implementation, results,
and analysis and discussion.
Introduction: Provides background information about the topic of this report, as well as
defining the problem statement and the aim of the study.
Methodology: Gives insight into the methods used to construct the project artifact.
Discussion: Evaluates the work and whether the results align with the stated goals and
requirements.
2 Methodology
2.1 Requirements
The functionalities that this software needed to implement were split into functional and
non-functional requirements; the full description of each requirement is given in appendix A
and B.
The strategy used for version control in Git was to utilize feature branches for developing
new components. When the work was done, a pull request to the main branch was created
with another team member set as reviewer, who evaluated whether to merge the branch or
not.
• Python
• PyCharm
• GitHub
• PyQt6
• OpenAI
• Matplotlib
• NumPy
• Pandas
The program was written in Python using the IntelliJ-based IDE PyCharm. Version control
was used to sync the team's work, which was achieved using a shared GitHub repository.
The GUI for the application was built using PyQt6, with various widgets used to construct
the different fields, buttons and radio buttons. For the LLM connection, the OpenAI library
was used to connect to and send queries to LLM models.
To visualize the data, NumPy, Pandas and Matplotlib were used to turn JSON data into
graphs showcasing the data in different views.
3 Implementation
The system contains four key components: getting input from the user, making API
requests to retrieve job listings, identifying skills using an LLM, and using that data to
create graphs.
3.1 GUI and Main Class
The ‘main’ class creates a window by extending the QMainWindow class in the PyQt6
library. Using a QFormLayout, additional graphical components are added to the main
window vertically. A listener is attached to the 'run' button to capture click events, so that a
function is called whenever the form is submitted. This function is responsible for sending
data to the different key components and passing their respective responses to the next
component in line (see figure 2).
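A minimal sketch of how such a window could be wired up is shown below. The class, widget and function names are illustrative assumptions, not the project's actual code.

# Sketch of the GUI described above (hypothetical names, not the project's code).
import sys
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QWidget, QFormLayout,
    QLineEdit, QComboBox, QPushButton,
)

class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("LLM_JobSkills")

        form = QFormLayout()
        self.location = QLineEdit()
        self.date_from = QLineEdit("2016-01-01")
        self.date_to = QLineEdit("2025-03-20")
        self.graph_type = QComboBox()
        self.graph_type.addItems(["Pie chart", "Bar diagram", "Line graph"])
        run_button = QPushButton("Run")
        # Listener for click events: triggers the pipeline when the form is submitted.
        run_button.clicked.connect(self.on_run)

        form.addRow("Location", self.location)
        form.addRow("From", self.date_from)
        form.addRow("To", self.date_to)
        form.addRow("Graph type", self.graph_type)
        form.addRow(run_button)

        container = QWidget()
        container.setLayout(form)
        self.setCentralWidget(container)

    def on_run(self):
        # Would pass the form values to the API, LLM and analysis components
        # in sequence (see figure 2). Placeholder behaviour only:
        print(self.location.text(), self.graph_type.currentText())

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = MainWindow()
    window.show()
    sys.exit(app.exec())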
Before the array of JSON objects retrieved from the API service is passed to the LLM for
identifying expected skills of job applicants, the headers ‘description’, ‘date published’ and
‘job ID’ are extracted. The job descriptions are picked out to reduce the overall size of the
request made to the LLM API. The publication date is selected in order to associate each
identified keyword with a date, so that the data analysis class can show the changes over
time. The internal ID of each listing is also picked out to filter out duplicate responses from
the job listing APIs, which may otherwise occur and make the final results misleading.
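The field extraction and de-duplication step could look roughly like the following sketch; the JSON key names ('id', 'publication_date', 'description') are assumptions about the API payload, not the exact fields used by the artifact.

# Sketch: reduce each job posting to the three fields the LLM step needs
# and drop duplicate listings by their internal ID (illustrative key names).
def prepare_postings(raw_postings: list[dict]) -> list[dict]:
    seen_ids = set()
    prepared = []
    for posting in raw_postings:
        job_id = posting.get("id")
        if job_id in seen_ids:
            continue  # duplicate listing returned by one of the APIs
        seen_ids.add(job_id)
        prepared.append({
            "job_id": job_id,
            "date_published": posting.get("publication_date"),
            "description": posting.get("description"),
        })
    return prepared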
Name                          Description
Platsbanken – jobsearch       Gets data from currently listed job ads from the Swedish public employment service.
Platsbanken – historical ads  Gets data from historical ads from the Swedish public employment service.
Job Posting API               Gathers data from employers' career websites from the past six months.
Table 4. Lists and describes the APIs used to gather job advertisements.
The initial setup was built to interface with a local instance of KoboldCPP, a program which
can run various local LLM models. This was done by making API calls to a configurable
localhost address with custom instructions and parameters. Responses were received, but
models capable of carrying out the task could not be run on the group's graphics cards at
an acceptable speed.
To address this issue, the group pivoted to remote LLM services and settled on DeepSeek,
which offered cheap API access without rate limits that could handle thousands of calls for
barely one dollar.
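Because DeepSeek exposes an OpenAI-compatible endpoint, the OpenAI client library can be pointed at it. The sketch below illustrates a single classification request; the prompt wording, environment variable name and model alias are assumptions rather than the project's exact configuration.

# Sketch of one classification request to DeepSeek through the OpenAI
# client library (prompt and env variable name are illustrative).
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",  # OpenAI-compatible endpoint
)

def extract_skills(description: str) -> str:
    response = client.chat.completions.create(
        model="deepseek-chat",  # served alias for DeepSeek-V3
        messages=[
            {"role": "system",
             "content": "List the skills required in this job ad as JSON, "
                        "marking each skill as LLM-related or not."},
            {"role": "user", "content": description},
        ],
    )
    return response.choices[0].message.content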
With the API connection set up, the responses sent back were well formatted and consistent
and could be used for the data analysis, but the processing speed was not acceptable for
what the group was aiming for. After noting the API's explicit lack of a rate limit, concurrent
API calls were implemented using multiple threads and futures, allowing for a large boost in
speed, with the LLM being sent ten times as much data at a time and returning it equally
fast. The returned data was gathered and sent back for final processing before being passed
to the data analysis component.
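The concurrent calls could be structured along the lines of the sketch below, which builds on the hypothetical extract_skills helper from the previous sketch; the worker count and batching strategy are assumptions.

# Sketch: send multiple job descriptions to the LLM concurrently using
# threads and futures (worker count is an assumption).
from concurrent.futures import ThreadPoolExecutor, as_completed

def extract_all_skills(postings: list[dict], max_workers: int = 10) -> list[dict]:
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_posting = {
            executor.submit(extract_skills, p["description"]): p
            for p in postings
        }
        for future in as_completed(future_to_posting):
            posting = future_to_posting[future]
            results.append({
                "job_id": posting["job_id"],
                "date_published": posting["date_published"],
                "skills": future.result(),
            })
    return results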
In the pie chart, the number of skills related to LLMs is compared to the number of
non-related skills. This is done by dividing the list of all skills into two lists based on whether
they are related to LLMs or not. By counting the number of entries in each respective list
and adding the two values to a data frame, a pie chart can be plotted.
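A minimal sketch of this pie chart construction, assuming the skills have already been split into two lists:

# Sketch: pie chart of LLM-related vs. non-related skill counts.
import pandas as pd
import matplotlib.pyplot as plt

def plot_skill_share(llm_skills: list[str], other_skills: list[str]) -> None:
    counts = pd.Series(
        {"LLM-related": len(llm_skills), "Not LLM-related": len(other_skills)}
    )
    counts.plot.pie(autopct="%1.1f%%", ylabel="")
    plt.title("Share of LLM-related skills")
    plt.show()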
The line graph is meant to show the frequency of specific skills in job ads over time. Each
line in the graph represents one skill; skills related to LLMs are drawn as solid lines,
whereas non-related skills are dotted.
The bar diagram shows the frequency of every keyword by representing each skill as a
vertical bar. If the skill is related to LLMs it is colored red, while non-related skills are blue.
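The colour coding of the bar diagram could be done along these lines; the data frame layout and column names are assumptions for illustration.

# Sketch: bar diagram of skill frequencies, red for LLM-related skills
# and blue for the rest (data frame columns are assumed).
import pandas as pd
import matplotlib.pyplot as plt

def plot_skill_frequencies(skills: pd.DataFrame) -> None:
    # Expected columns: 'skill' (str), 'count' (int), 'llm_related' (bool).
    colors = skills["llm_related"].map({True: "red", False: "blue"})
    plt.bar(skills["skill"], skills["count"], color=colors)
    plt.xticks(rotation=90)
    plt.ylabel("Frequency")
    plt.tight_layout()
    plt.show()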
4 Results
After running the program, the following graphs were created on separate occasions using
an empty 'location' parameter and specifying the time span to be from 2016-01-01 to
2025-03-20.
Figure 6. The pie diagram created by the program.
5 Analysis and Discussion
The limitations present in the tool include, for example, the API to the DeepSeek LLM,
which can act as a bottleneck when a large quantity of data is to be processed, significantly
slowing down the execution of the software. Another limitation related to the LLM is the
capability of the model itself, which affects how skills are classified as related to LLMs or
not. This affects the credibility of the results when they are later presented in the graphs.
The keywords retrieved by the LLM could be further specified in the instructions and
narrowed down; with more testing of instructions, it might be possible to guide the LLM to
send back a narrower scope of relevant keywords for analysis.
The selection of available graphs was chosen based on the type of information the tool was
to display. For visualizing changes over time, the occurrences of each skill, and the
percentage of LLM-related skills in comparison to non-related skills, line, bar and pie graphs
were suitable for their respective purposes.
Among previous studies that investigated what qualifications appear in job ads, the study by
Kutela et al. [4] used a text network method to identify keywords, where the thickness of the
edges between nodes represents the frequency of a keyword. That methodology was able
to identify frequently used words in job postings for prompt engineering positions. However,
the results also contained a number of unrelated words, such as "best", "various" and "can".
While that approach could be more suitable for analyzing a much greater quantity of job
descriptions than the one described in this paper, by using an LLM to identify skills, this
study was able to more accurately find trends in job listings.
References
[1] F. Song, A. Agarwal, and W. Wen, “The Impact of Generative AI on Collaborative Open-Source
Software Development: Evidence from GitHub Copilot,” SSRN Journal, 2024, doi:
10.2139/ssrn.4856935.
[2] N. Bunker, "March 2024 US Labor Market Update: AI Jobs Are on the Rebound," Indeed Hiring Lab.
Accessed: Mar. 20, 2025. [Online]. Available: https://www.hiringlab.org/2024/03/14/march-2024-us-labor-market-update/
[3] J. Zinkula, "Job seekers are scrambling to add new AI skills to their LinkedIn profiles as postings
mentioning ChatGPT surge," Business Insider. Accessed: Mar. 20, 2025. [Online]. Available:
https://www.businessinsider.com/ai-job-descriptions-add-chatgpt-skills-to-linkedin-profiles-2023-8
[4] B. Kutela, N. Novat, N. Novat, J. Herman, A. Kinero, and S. Lyimo, “Artificial Intelligence (AI) and
Job Creation: An Exploration of the Nature of the Jobs, Qualifications, and Compensations of Prompt
Engineers,” Nov. 06, 2023, Social Science Research Network, Rochester, NY: 4625139. doi:
10.2139/ssrn.4625139.
Appendix
5.1 Appendix A: functional requirements
ID FR1
DESC Users need to be able to specify a country/city/location when searching for job
listings.
RAT In order to find what qualifications are needed for jobs close to the user.
DEP None.
ID FR2
DESC Researchers need to be able to specify what type of graph the data is shown in.
RAT Depending on the point wanting to be illustrated, different graphs can convey
and display data differently.
DEP None.
ID FR3
DESC The program should retrieve relevant job listing data from external job
aggregators using API calls.
RAT Gathering data is crucial for analyzing trends and a fundamental part of the
usability of the program.
DEP Depends on availability of API and configuration and available filtration criteria.
ID FR4
DESC The program should be capable of listing trends over a period decided by the
user.
RAT For analysis and relevant graphs to be created a wider perspective such as
seeing the trends evolving over time is needed.
ID QR1
RAT So that the user won’t have to wait too long for the program to finish.
DEP None.
ID QR2
DESC The program should have an interactable GUI through which the setup for the
gathering can be done.
RAT Not having a GUI and limiting functionality to something like a command-line
interface would be unfriendly and less intuitive for unfamiliar users.