Benner T. Naming Things. The Hardest Problem in Software Engineering 2023
I. Introduction
1. Purpose of this book
2. Why is naming important?
2.1. Understandable code
2.2. Productivity and happiness
2.3. Career growth
3. Why is naming hard?
3.1. Dynamic, subjective requirements
3.2. Insufficient best practices and tooling
3.3. Short-term costs and long-term value
3.4. How can we make naming easier?
4. Scope of this book
II. Principles
5. Overview
5.1. List of principles
6. Understandability
6.1. Overview
6.2. Rules
6.2.1. Describe the concept
6.2.2. Use dictionary terms
6.2.3. Use problem-domain terms
6.2.4. Use standard terms for domain-agnostic concepts
6.2.5. Use correct pluralization
6.2.6. Use accurate parts of speech
6.2.7. Include units in measurements
6.2.8. Avoid unconventional single-letter names
6.2.9. Avoid abbreviations
6.2.10. Avoid non-standard symbolic names
6.2.11. Avoid cleverness
6.2.12. Avoid usage of temporary or irrelevant concepts
6.2.13. Consider the audience’s familiarity with the name
7. Conciseness
7.1. Overview
7.2. Rules
7.2.1. Use the appropriate level of abstraction
7.2.2. Use words with rich meaning
7.2.3. Omit metadata
7.2.4. Omit implementation details
7.2.5. Omit unnecessary words
8. Consistency
8.1. Overview
8.2. Rules
8.2.1. Obey popular naming conventions
8.2.2. Avoid synonyms
8.2.3. Avoid abbreviations
8.2.4. Use similar names for similar concepts
8.2.5. Use consistent antonyms
9. Distinguishability
9.1. Overview
9.2. Rules
9.2.1. Avoid homographs and near-homographs
9.2.2. Avoid homophones and near-homophones
9.2.3. Avoid polysemes
9.2.4. Avoid names with distinct technical and non-technical meanings
III. Application
10. Overview
11. Tradeoffs
11.1. Overview
11.2. Consistency vs. other principles
11.3. Understandability vs. Conciseness
11.4. Understandability vs. Distinguishability
11.5. Conciseness vs. Distinguishability
12. Identifier types
12.1. Overview
12.2. Classes
12.3. Variables
12.3.1. Booleans
12.3.2. Collections
12.3.3. Hash maps
12.4. Methods
12.5. Method arguments
12.6. Interfaces
12.7. Constants
12.8. Packages/modules/namespaces
13. Style guides
14. Controlled vocabularies
14.1. Further reading
15. Renaming
15.1. Scope
15.2. Principal and interest
15.2.1. Interest costs
15.2.2. Principal costs
15.3. Process
16. Domain-specific names
16.1. Consult domain-related resources
16.2. Consult domain experts
16.3. Consult team members
17. Developing naming skills
17.1. Improving your naming skills
17.1.1. Initial steps
17.1.2. Ongoing learning
17.2. Improving your team’s naming skills
IV. Appendix
18. Common antonyms
19. Visually similar characters
20. Rejected principles
20.1. Length
20.2. Searchability
20.3. Pronounceability
20.4. Meaningfulness
20.5. Austerity
20.6. Accuracy
20.7. Precision
20.8. Concision
References
Acknowledgements
This book is the systemization of the knowledge of many
people with experience in the software engineering industry
and academia. I greatly appreciate the sharing of this
knowledge through papers, books, articles, talks,
conversations, and other means. That shared knowledge
was both the inspiration for and the foundation of this
book.
I also want to give a special thank you to everyone who
provided insightful feedback that shaped and refined the
content of this book: Albert Chae, Andy Wang, David Gage,
Joel Eaton, Marian Hlavac, Nathan Miles, Richard Dyce,
Rob Nugen, Tiago Boldt Sousa, Tom Clark, and Zeger
Knops.
I. Introduction
1. Purpose of this book
The naming of identifiers (classes, variables, functions, etc.)
is one of the most frequently used, timeless, and impactful
skills in software engineering. However, it’s rarely
analyzed, poorly understood, and poorly executed.
This book’s goal is to make software engineering more
efficient and enjoyable through better naming. This may
sound audacious, but we can achieve it by applying the
same practical, rigorous structure that we’ve used to
improve other aspects of software engineering.
2. Why is naming important?
— Phil Karlton
Understandability
A name should describe the concept it represents.
Conciseness
A name should use only the words necessary to
communicate the concept it represents.
Consistency
Names should be used and formatted uniformly.
Distinguishability
A name should be visually and phonetically distinguishable
from other names.
6. Understandability
— George Orwell
6.1. Overview
6.2. Rules
# Bad
foo, thing, scooter
# Good
bicycle
# Bad
o, org, people_group
# Good
organization
6.2.3. Use problem-domain terms
# Bad
schedule_events(events)
# Good
schedule_meetings(meetings)
# Bad
Database.remove_record(table, primary_key)
# Good
Database.delete_row(table, primary_key)
# Bad
Process.stop(process_id, signal_name)
# Good
Process.kill(pid, signal_name)
# Good
user = User.where(id=user_id)[0]
# Bad
user = User.where(state='valid')
# Good
users = User.where(state='valid')
# Bad
validate_user(users)
# Good
validate_users(users)
# Bad
user.validation()
# Good
user.validate()
# Bad
user.validate()
# Good
user.is_valid(), user.valid?()
6.2.7. Include units in measurements
# Bad
elapsed_duration
# Good
elapsed_duration_in_days
# Bad
remaining_distance
# Good
remaining_distance_in_meters
# Bad
temperature
# Good
temperature_in_celsius
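To see why the unit belongs in the name, consider what happens when a value is converted. The following is a minimal sketch, not taken from any particular codebase; the helper functions and sample values are assumptions made for illustration. Because every name states its unit, a mismatched conversion is visible at the call site.

```python
SECONDS_PER_DAY = 24 * 60 * 60
METERS_PER_KILOMETER = 1000


def to_days(elapsed_duration_in_seconds: float) -> float:
    """Convert a duration from seconds to days (hypothetical helper)."""
    return elapsed_duration_in_seconds / SECONDS_PER_DAY


def to_kilometers(remaining_distance_in_meters: float) -> float:
    """Convert a distance from meters to kilometers (hypothetical helper)."""
    return remaining_distance_in_meters / METERS_PER_KILOMETER


# Sample values chosen only for illustration.
elapsed_duration_in_seconds = 172_800
elapsed_duration_in_days = to_days(elapsed_duration_in_seconds)

remaining_distance_in_meters = 2_500
remaining_distance_in_kilometers = to_kilometers(remaining_distance_in_meters)
```

Passing remaining_distance_in_meters to to_days would look wrong to a reader immediately, which is not the case for a bare remaining_distance.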
# Bad
u = users[0]
# Good
user = users[0]
# Bad
l = users.length
row_count = l + 1
# Good
user_count = users.length
row_count = user_count + 1
# Bad
for u in users:
    print(u.name)
# Good
# Bad
ap, acts_payable
# Good
accounts_payable
# Bad
org
# Good
organization
# Bad
def ->(&block)
# Good
def map(&block)
6.2.11. Avoid cleverness
# Bad
apply_kevlar(text)
# Good
remove_bullets(text)
# Bad
kill_em_all(processes)
# Good
kill_processes(processes)
— François Fénelon
7.1. Overview
class PhoneNumberPresenter:
    def my_new_method(phone_number):
        return phone_number.strip()
You could use one of the following names for your new
method:
1. process(phone_number)
2. format(phone_number)
3. trim_whitespace(phone_number)
4. strip(phone_number)
The name process doesn’t give the reader any relevant
information about the operation being performed.
The name format tells the reader that the method will
format the phone number, which is likely close to the
verb that they expect to find for a method that meets
this requirement.
The name trim_whitespace tells the reader what the
implementation is doing but doesn’t provide information
about its intended use.
The name strip tells the reader exactly what the
implementation is without providing information about its
intended use.
format is the name that best tells the reader the intent of
the method: what it should be used for. This is the
information that is relevant to the reader.
Details about its implementation should not be included
in the name, as they 1) may change, 2) are not relevant for
most readers, and 3) can always be seen by viewing the
implementation.
Thus, a ladder of abstraction can be used to consider
multiple names and then choose the name at the rung in the
ladder that gives the reader the most relevant information
about the concept without describing its implementation.
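One way to test a rung on the ladder is to ask whether the name survives a change to the implementation. The sketch below is an illustration under stated assumptions: the second class and its whitespace-collapsing behavior are hypothetical and only stand in for a later revision of the same method.

```python
import re


class PhoneNumberPresenter:
    """Prepares phone numbers for display."""

    @staticmethod
    def format(phone_number: str) -> str:
        # First implementation: only strips surrounding whitespace.
        return phone_number.strip()


class RevisedPhoneNumberPresenter:
    """Hypothetical later revision of the same presenter."""

    @staticmethod
    def format(phone_number: str) -> str:
        # The implementation now also collapses internal runs of whitespace,
        # yet the name format still describes the intent accurately.
        return re.sub(r"\s+", " ", phone_number.strip())
```

Had the method been named strip or trim_whitespace, the revision would have made the name misleading, and every caller would need to be renamed.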
# Bad
SongCollection
# Good
Album
# Bad
ChildTask
# Good
Subtask
# Good
fetch_user(user_id)
# Bad
first_name_string, first_name_str
# Good
first_name
# Bad
person_list, people_array
# Good
people
There are rare exceptions to this rule. For example, if a
dynamically typed language is used and knowledge of the type
of a variable is crucial in a specific context, then including
the type in the name may be worthwhile. Similarly, if two
names represent a similar concept but have different types,
a hint about the types within one or both of the names can
make them more understandable.
# Bad
location = 'Chicago'
user.location = Location.geocode(location)
# Good
location_text = 'Chicago'
user.location = Location.geocode(location_text)
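The example above can be made runnable to show the two types side by side. This is a minimal sketch under assumptions: the Location and User classes, the KNOWN_COORDINATES table, and the approximate coordinates are all hypothetical stand-ins for a real geocoding service.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup table standing in for a real geocoding service.
# The coordinates are approximate and used only for illustration.
KNOWN_COORDINATES = {"Chicago": (41.88, -87.63)}


@dataclass(frozen=True)
class Location:
    latitude: float
    longitude: float

    @classmethod
    def geocode(cls, location_text: str) -> "Location":
        # Turn free-form text into a structured Location.
        latitude, longitude = KNOWN_COORDINATES[location_text]
        return cls(latitude, longitude)


@dataclass
class User:
    location: Optional[Location] = None


location_text = "Chicago"  # a str
user = User()
user.location = Location.geocode(location_text)  # a Location
```

The _text suffix is the only thing that tells a reader that location_text and user.location hold different types for the same concept.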
# Bad
ComparisonInterface
# Good
Comparable
7.2.4. Omit implementation details
# Bad
csv_processor.process_in_parallel(rows)
# Good
csv_processor.process(rows)
# Bad
csv_processor.process_in_parallel(rows)
csv_processor.process_in_serial(rows)
# Good
csv_processor.process(rows, in_parallel=True)
csv_processor.process(rows, in_parallel=False)
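A single process method with an in_parallel argument might look like the following. This is a sketch, assuming a thread pool is an acceptable way to parallelize and that the per-row work is simply trimming cells; both choices are illustrative, not a prescribed implementation.

```python
from concurrent.futures import ThreadPoolExecutor


class CsvProcessor:
    def process(self, rows, in_parallel=False):
        """Process every row and return the results in input order."""
        if in_parallel:
            # executor.map yields results in the same order as the input.
            with ThreadPoolExecutor() as executor:
                return list(executor.map(self._process_row, rows))
        return [self._process_row(row) for row in rows]

    @staticmethod
    def _process_row(row):
        # Hypothetical per-row work: trim whitespace from each cell.
        return [cell.strip() for cell in row]


csv_processor = CsvProcessor()
rows = [[" a ", "b "], ["c", " d"]]
serial_result = csv_processor.process(rows, in_parallel=False)
parallel_result = csv_processor.process(rows, in_parallel=True)
```

Callers see one name for one concept, and the execution strategy can change or gain new options without renaming the method.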
# Bad
user.delete_now()
# Good
user.delete()
# Bad
User.delete_user(user)
# Good
User.delete(user)
# Bad
if not user.is_invalid():
    user.save()
# Good
if user.is_valid():
    user.save()
8. Consistency
8.1. Overview
8.2. Rules
# Bad
class User:
    def __init__(self):
        self.name = None
    def get_name(self):
        return self.name
    def set_name(self, name):
        self.name = name
# Good
class User:
    def __init__(self, name):
        self.name = name
# Bad
class User:
    def __init__(self, full_name, first_name, last_name):
        self.full_name = full_name
        self.first_name = first_name
        self.last_name = last_name
class Person:
    def __init__(self, full_name, given_name, family_name):
        self.full_name = full_name
        self.given_name = given_name
        self.family_name = family_name
# Good
class User:
    def __init__(self, full_name, first_name, last_name):
        self.full_name = full_name
        self.first_name = first_name
        self.last_name = last_name
class Person:
    def __init__(self, full_name, first_name, last_name):
        self.full_name = full_name
        self.first_name = first_name
        self.last_name = last_name
Synonyms of more generic concepts should also be
avoided. For example, if “start” and “initiate” are both used
to refer to the same concept of starting a workflow, this
creates an inconsistency which makes it difficult to perform
a search for this concept and may confuse some readers.
However, because the terms are understandable to a broad
audience and have almost the same meaning, they’re less
confusing than domain-specific synonyms, which are
typically less understandable to first-time audiences.
# Bad
onboarding_workflow.start()
offboarding_workflow.initiate()
# Good
onboarding_workflow.start()
offboarding_workflow.start()
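Beyond searchability, a shared verb also lets code treat the two workflows uniformly. The sketch below is illustrative only: the class bodies, the is_started flag, and the start_all helper are assumptions, not part of the original example.

```python
class OnboardingWorkflow:
    def __init__(self):
        self.is_started = False

    def start(self):
        self.is_started = True


class OffboardingWorkflow:
    def __init__(self):
        self.is_started = False

    def start(self):
        self.is_started = True


def start_all(workflows):
    # Works only because every workflow uses the same verb, start.
    for workflow in workflows:
        workflow.start()


workflows = [OnboardingWorkflow(), OffboardingWorkflow()]
start_all(workflows)
```

If one class used initiate instead, start_all would need a special case, and a search for start would miss half of the call sites.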
# Bad
# Good
# Bad
UserController
CompaniesController
# Good
UsersController
CompaniesController
# Bad
add/uninstall
install/remove
# Good
add/remove
install/uninstall
9.2. Rules
# Bad
# Good
# Bad
class OrganizationOnboardingProcess:
class OrganizationOnboardingProcessor:
# Good
class Onboarding:
class OnboardingProcessor:
Differences in capitalization are not sufficient to make
two names distinguishable. If the names represent different
concepts, they should be changed to more precisely reflect
their concepts.
# Bad
class Datastore:
class DataStore:
# Good
class SqlDatabase:
class InMemoryDatabase:
# Bad
index_0 = 0
index_O = origin_point.index
# Good
initial_index = 0
origin_index = origin_point.index
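Names like Datastore and DataStore, or index_0 and index_O, can also be caught mechanically. The following is a rough sketch of such a check, assuming a small, hand-picked table of visually similar characters; a real check would need a fuller table and is not something this book prescribes.

```python
from collections import defaultdict

# Illustrative subset of visually similar characters, applied after
# lowercasing: zero and "o", and one, "i", and "l".
VISUALLY_SIMILAR_CHARACTERS = str.maketrans({"0": "o", "1": "l", "i": "l"})


def find_indistinguishable_names(names):
    """Group names that differ only by case or by similar-looking characters."""
    names_by_appearance = defaultdict(list)
    for name in names:
        appearance = name.lower().translate(VISUALLY_SIMILAR_CHARACTERS)
        names_by_appearance[appearance].append(name)
    return [group for group in names_by_appearance.values() if len(group) > 1]


indistinguishable_names = find_indistinguishable_names(
    ["Datastore", "DataStore", "index_0", "index_O", "initial_index"]
)
```

Each returned group is a set of names that a reader could easily confuse and that should be renamed to reflect their distinct concepts.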
# Bad
winnowed_results = winnow_results(results)
return window_results(winnowed_results)
# Good
valid_results = select_valid_results(results)
return window_results(valid_results)
# Bad
class Datastore:
class DataStore:
# Good
class SqlDatabase:
class InMemoryDatabase:
# Bad
book = Book.find_by_isbn(isbn)
book.books_in_stock_count
# Good
book = Book.find_by_isbn(isbn)
book.physical_copies_in_stock_count
# Bad
class Class:
# Good
class Course:
12.2. Classes
# Bad
# Good
user_validator = UserValidator(user)
# Bad
# Good
user_validator = UserValidator(user)
user_validator.validate_email()
12.3. Variables
# Bad
person = User.where(id=user_id)[0]
# Good
user = User.where(id=user_id)[0]
# Bad
# Good
12.3.1. Booleans
# Bad
# Good
if process.is_complete():
    send_confirmation()
12.3.2. Collections
user_id_to_user
user_ids_users
users_by_id
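As a brief illustration of how such a name reads at the point of use, here is a minimal sketch. The User class and the sample data are hypothetical and exist only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


# Sample data for illustration only.
users = [User(1, "Ada"), User(2, "Grace")]

# The name states both the key (id) and the value (user).
users_by_id = {user.id: user for user in users}

user = users_by_id[2]
```

A lookup such as users_by_id[2] reads naturally, and the reader does not need to inspect the construction of the hash map to know what a key or a value is.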
# Bad
# These two variable names don't make it clear what the variable contains
user_dict = { 1: user_1, 2: user_2 }
user_ids = { 1: user_1, 2: user_2 }
# Good
12.4. Methods
# Bad
# Good
12.7. Constants
12.8. Packages/modules/namespaces
15.1. Scope
Principal
15.3. Process
The fact that you’re reading this book shows that you’re
already ahead of the curve when it comes to thinking about
naming, but our goal is to improve names by applying this
information.
20.1. Length
20.2. Searchability
20.3. Pronounceability
20.5. Austerity
20.6. Accuracy
20.7. Precision