0% found this document useful (0 votes)

6 views

LLM_JobSkills

This paper describes the development of a tool for analyzing job ads for software developer positions, focusing on skills related to large language models (LLMs). The tool successfully retrieves job postings, identifies relevant skills using an LLM, and visualizes the data through graphs. Future work aims to enhance data analysis for a deeper understanding of skill demand trends over time.

Uploaded by

muhammadhashir.raja

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

6 views

LLM_JobSkills

Uploaded by

muhammadhashir.raja

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 19

Abstract

This paper presents the process of developing a tool for analyzing expected skills in job ads
for software developer positions, and to highlight the frequency of skills related to large
language models (LLM). The artifact uses a number of APIs to retrieve job postings from
different sources and then uses a LLM to identify skills within the job descriptions and their
relevancy to the use of LLMs. Using the data from the LLM, the system should create
graphs to visualize the trends for skills in job listings. The results of the project show that the
implementation was successful in the ability to get job postings, find keywords and present
the data visually via graphs. For future work, expanding on the data analysis portion of the
software in order to provide a more in-depth overview of the changes in demand for specific
skills over time, for example.

Keywords: software development, LLM, AI, python, market trends, skills, job qualifications
Table of Contents
Table of Tables ............................................................................ 6
Table of Figures .......................................................................... 7
Terminology ................................................................................ 8
1 Introduction ........................................................................... 9
1.1 Problem Statement ................................................................................................. 9

1.2 Background and Motivation .................................................................................... 9

1.3 Aim ........................................................................................................................ 10

1.4 Scope .................................................................................................................... 11

1.5 Completeness Criteria .......................................................................................... 12

1.6 Time Plan .............................................................................................................. 12

1.7 Outline................................................................................................................... 12

2 Methodology ....................................................................... 12
2.1 Requirements ....................................................................................................... 13

2.1.1 Functional requirements ................................................................................... 13

2.1.2 Non-functional requirements ............................................................................ 13
2.2 Development Method and Workflow..................................................................... 13

2.3 Tools and Architecture .......................................................................................... 13

3 Implementation ................................................................... 14
3.1 GUI and Main Class.............................................................................................. 15

3.2 API Service component ........................................................................................ 15

3.3 LLM Communication component .......................................................................... 15

3.4 Data Analysis component ..................................................................................... 16

4 Results ............................................................................... 18
5 Analysis and Discussion ..................................................... 19
References ............................................................................... 19
Appendix................................................................................... 21
5.1 Appendix A: functional requirements .................................................................... 21

5.2 Appendix B: Non-functional requirements ............................................................ 22

Table of Tables
Table # Description

1 The time plan showing each activity performed during the course.

2 Shows the functional requirements for the project artifact.

3 Shows the non-functional requirements for the project artifact.

4 Lists and describes the APIs used to gather job advertisements.

6
Table of Figures
Figure # Description

1 Indeeds graph showcasing the rise of GenAI jobs.

2 The process that the program will use to gather and parse data.

3 Shows the execution flow of the system.

4 The line diagram created by the program.

5 The bar diagram created by the program.

6 The pie diagram created by the program.

7
Terminology

API Application Programming Interface

GUI Graphical User Interface

LLM Large Language Model

AI Artificial Intelligence

NLP Natural Language Processing

8
1 Introduction
1.1 Problem Statement
There are many programming languages and specializations within the field of software
development, and with the recent emergence of LLMs capable of increasing productivity [1]
by acting as code assistants for example. However, as a job applicant or market researcher,
it can be difficult to gain an understanding of what skills are expected of software developers
by hiring companies, especially skills related to the use of LLMs. There are many tools
available for conducting market research by analyzing job postings, yet there aren’t any that
analyze the impact of LLMs in terms of desired skills. This project aimed to create an easy-
to-use open-source tool for retrieving data from job postings and to analyze what skills are
currently in demand.

1.2 Background and Motivation

9
With the introduction of LLMs accessible LLMs such as OpenAIs GPT line, Claude or the
more recent DeepSeek models, the job market has been affected as reportet by indeeds
Hiring Lab [2]. Their analysis shows a meteoric rise in American job listings for generative AI
jobs starting in 2024, and although the last recorded value of 0.01%, this sudden trend
upwards warrants attention and further analysis.

Figure 1. Indeeds graph showcasing the rise of GenAI jobs.

This coincides with the introduction of new roles such as ”Prompt Engineer” and ”Generative
AI Engineer” being introduced to the market. Listings for positions such as these doubled in
the second quarter of 2023, making ”Generative AI Engineer” the third fastest growing job
title on LinkedIn according to Business Insider [3].

This sudden interest in growth for jobs using LLMs and not just working on the creation of
LLMs marks the beginning of the incorporation of Generative AI on the job market and to
keep these trends under watch, tools and further study is necessary, to allow researchers
and academics to analyze and for job-seekers to be able to keep up with the changing
market. It is therefore important for tools that can analyze and map trends like this on the
job-market to exist and grow.

1.3 Aim
This project aims to develop a tool that is capable of gathering job postings using a number
of APIs, to then use an LLM to identify qualifications, and whether they are related to large

10
language models. This program can be used to get a graphical display of skills using
graphs.

Figure 2. The process that the program will use to gather and parse data.

1.4 Scope
The project artifact will have a GUI, this window will contain a form where users can specify
a timeframe, location and what type of graph that the data will be shown in. The types of
graphs that will be available are pie charts, bar diagrams and line graphs. In the pie chart, it
will display the percentage of LLM-related skills out of all skills gathered. The bar graph will
show frequency of each skill found, highlighting the skills related to LLMs. For the line
graph, it will display the frequency over time.

11
The tool will use the NLP model ‘DeepSeek-V3’ to identify requirements from job
descriptions and to mark whether they are related to LLMs or not. Due to the limited time
allotted for this project, the ability for users to pick different models will not be implemented.

1.5 Completeness Criteria

A working tool which accepts inputs from users to use as parameters for gathering job
listings from APIs, the results are passed to an LLM which picks out skills from their
descriptions. The results will be presented as graphs in the GUI, that will also be saved as
images locally. The criteria for this tool are stated in section 2.1, and functional/non-
functional requirements are listed in the document appendix.

1.6 Time Plan

Week 4 5 6 7 8 9 10 11 12

planning

SRS

construction

report

presentation

Table 1. The time plan showing each activity performed during the course.

1.7 Outline
This report is composed of the sections: introduction, methodology, implementation, results
and discussion.

Introduction: Provides background information about the topic of this report, as well as
defining a problem statement and aim of the study.

Methodology: Gives insight on the methods used to construct the project artifact.

Implementation: Indepth description of how the tool was implemented.

Results: Describes the end-product of the project and its output.

Discussion: Evaluation of the work, whether the results align with the stated goals and
requirements.

2 Methodology
12
2.1 Requirements
The functionalities that this software needed to implement were split into functional and non-
functional requirements, the full description for each requirement is shown in appendix A
and B.

2.1.1 Functional requirements

ID Name

FR1 Specify location

FR2 Specify graph type

FR3 Retrieve job listings via API

FR4 Specify timescale

Table 2. Shows the functional requirements for the project artifact.

2.1.2 Non-functional requirements

ID Name

QR1 Response time

QR2 Graphical interface

Table 3. Shows the non-functional requirements for the project artifact.

2.2 Development Method and Workflow

The workflow used for this project was a waterfall model in order to complete the artifact
within the given timeframe, which meant that the requirements and structure had to be
outlined before starting to implement. As each component and functionality had been
outlined in the SRS document for the project, the software was developed sequentially
starting from the GUI to API service component, LLM module and lastly the data analysis
component. This workflow was preferred over other models such as agile, whose key
strength lies in being able to adapt to changing requirements, because the requirements for
the project are defined in the SRS. The drawback of agile development is the uncertain
completeness due to scope creep, the additional time spent on team meetings and meta
tasks of managing backlogs were key arguments for picking a waterfall model instead of
agile.

The strategy used for version control in GIT was to utilize feature branches for developing
new components. When done with the work, a pull request to the main branch was created
with another team member set as reviewer, that evaluated whether to merge the branch or
not.

2.3 Tools and Architecture

13
Development Tools:

• Python

• PyCharm

• GitHub

Frameworks and libraries:

• PyQt6

• OpenAI

• Matplotlib

• NumPy

• Pandas

The program was programmed using Python, using the IntelliJ IDE PyCharm. Version
Control was used to sync work for the team, this was achieved using a shared GitHub
repository.

The GUI for the application was done using PyQt6 and various widgets to build up the
different fields, buttons and radios. For the LLM connection the OpenAI library was
implemented to connect and send queries to LLM models.

To visualize the data NumPy, Pandas and MatplotLib was used to turn JSON data into
graphs showcasing the data in different views.

3 Implementation

Figure 3. Shows the execution flow of the system.

The system contains four key components to get input from the user, make API requests to
get job listings, identify skills using an LLM and using that data to create graphs.

14
3.1 GUI and Main Class
The ‘main’ class creates a window by extending the QMainWindow class in the PyQt6
library. Using a QFormLayout, additional graphical components are added to the main
window vertically. A listener is attached to the ‘run’ button to capture click events, so that a
function is called whenever the form is to be submitted. This function is responsible for
sending data to the different key components and passing their respective responses to the
next component in line (see figure 2).

Before the array of JSON objects retrieved from the API service is passed to the LLM for
identifying expected skills of job applicants, the headers ‘description’, ‘date published’ and
‘job ID’ are extracted. The job descriptions are picked out to reduce the overall size of the
request made to the LLM API. The publication date is selected in order to associate each
identified keyword with a date, so that the data analysis class can show the changes over
time. The internal ID of each listing is also picked out to filter out duplicate responses from
the job listing APIs which may occur and thereby make the final results misleading.

3.2 API Service component

The API Service class uses the grequests python library to make parallel HTTP requests to
the job posting APIs.

Name Description

Indeed Retrieves data from the job ad website ‘Indeed’.

Platsbanken - jobsearch Gets data from currently listed job ads from the Swedish
public employment service.

Platsbanken – historical ads Gets data from historical ads from the Swedish public
employment service.

Job Posting API Gathers data from employers’ career websites from the past
six months.

Table 4. Lists and describes the APIs used to gather job advertisements.

3.3 LLM Communication component

To gather keywords representing the requested competencies of a given job listing, a
connection to an LLM was to be used, this was to allow for the flexible natural language
understanding which would be able to gather keywords regardless of phrasing or layout.

The initial setup was built to interface with a local instance of KoboldCPP a program which
can run various local LLM models, this was done by making api calls to a changeable
localhost address, using this API call with custom instructions and parameters responses

15
were received but models capable of carrying out the task were not runnable on the groups
Graphics cards at an acceptable speed.

To address this issue, the group pivoted to remote LLM services and settled on DeepSeek
which offered cheap, non-rate limited API access which could handle thousands of calls for
barely one dollar.

With the API connection set-up, the responses sent back were well formatted and consistent
and could be used for the data analysis, but the processing speed was not acceptable for
the speed the group was aiming for, so after noting the APIs explicit lack of a rate limit,
concurrent API calls were implemented using multiple threads and futures, allowing for a
large boost in speeds with the LLM being sent 10x the amount of data at a time and
returning them equally fast. The returned data was gathered and sent back for final
processing before being sent into the data analysis component.

3.4 Data Analysis component

The purpose of the data analysis component is to parse the data from the two previous
components into a data frame using the pandas library. This library is built for manipulating
and analyzing data by providing different data structures and operations. To create graphs
from the given data frame, the component uses the library matplotlib to create three types of
graphs: pie charts, line graphs and bar diagrams.

16
In the pie chart, the number of skills related to LLMs are compared to the number of
non-related skills. This is done by dividing the list of all skills into two lists based on whether
they are related to LLMs or not. By counting the number of entries in each respective list
and adding the two values to a data frame, a pie chart can be printed.

The line graph is meant to show the frequency of specific skills in job ads over time, each
line in the graph represents one skill, and skills related to LLMs are highlighted in straight
lines, whereas non-related skills are dotted.

The bar diagram shows the frequency of every keyword by representing each skill as a
vertical bar. If the skill is related to LLMs it is colored red, while non-related skills are blue.

17
4 Results
After running the program, the following graphs were created in separate occasions using
an empty ‘location’ parameter and specifying the time span to be within 2016-01-01 to 2025-
03-20.

Figure 4. The line diagram created by the program.

Figure 5. The bar diagram created by the program.

18
Figure 6. The pie diagram created by the program.

5 Analysis and Discussion

The current iteration of the tool fulfills the stated functional requirements, it is able to gather
job postings from APIs, identify skills and use different types of graphs to present the data.
In terms of non-functional requirements, the GUI provides a visual interface for interacting
with the system, encapsulates the form used to modify the behavior of the tool and displays
the figures produced after executing a search. The non-functional requirement of response
time was implemented when making API calls both to the job posting APIs, as well as the
LLM by using multithreading to handle a number of requests at a time.

The limitations present in the tool are, for example, the API to the DeepSeek LLM, which
can act as a bottleneck when a large quantity of data is meant to be processed, significantly
slowing down the execution of the software. Another limitation related to the LLM is the
intelligence of the chatbot, which affects the way skills are classified, whether they are
related to LLMs or not. This affects the credibility of the results when presented later in the
graphs.

The keywords retrieved by the LLM can be further specified in the instructions and be
further narrowed down, with more testing of instructions, it might be possible to guide the
LLM to send back a more narrowed scope of relevant keywords for analysis.

The selection of graphs available was chosen based on the type of information the tool was
to display. For visualizing changes over time, the occurrences of each skill, and the
percentage of skills related to LLMs in comparison to non-related skills, the use of line, bar
and pie graphs were suitable for their respective purposes.
In previous studies that investigated what qualifications were present in job ads such as [4],
this study used a text network method to identify keywords, where the thickness of the
edges between each node represents the frequency of a keyword. The methodology that
was employed by Kutela et. Al. [4] was able to identify frequently used words in job postings
for prompt engineering positions. However, the results also contained a number of unrelated
words, such as “best”, “various” and “can”. While this approach could be more suitable for
analyzing a much greater quantity of job descriptions than the one described in this paper,
by using a LLM to identify skills, this study was able to more accurately find trends in job
listings.

References

19
[1] F. Song, A. Agarwal, and W. Wen, “The Impact of Generative AI on Collaborative Open-Source
Software Development: Evidence from GitHub Copilot,” SSRN Journal, 2024, doi:
10.2139/ssrn.4856935.

[2] N. Bunker, ‘March 2024 US Labor Market Update: AI Jobs Are on the Rebound’, Indeed Hiring Lab.
Accessed: Mar. 20, 2025. [Online]. Available: https://www.hiringlab.org/2024/03/14/march-2024-us-
labor-market-update/

[3] J. Zinkula, ‘Job seekers are scrambling to add new AI skills to their LinkedIn profiles as postings
mentioning ChatGPT surge’, Business Insider. Accessed: Mar. 20, 2025. [Online]. Available:
https://www.businessinsider.com/ai-job-descriptions-add-chatgpt-skills-to-linkedin-profiles-2023-8

[4] B. Kutela, N. Novat, N. Novat, J. Herman, A. Kinero, and S. Lyimo, “Artificial Intelligence (AI) and
Job Creation: An Exploration of the Nature of the Jobs, Qualifications, and Compensations of Prompt
Engineers,” Nov. 06, 2023, Social Science Research Network, Rochester, NY: 4625139. doi:
10.2139/ssrn.4625139.

20
Appendix
5.1 Appendix A: functional requirements
ID FR1