Introduction To Conformal
Prediction With Python
A Short Guide for Quantifying Uncertainty of Machine
Learning Models
Christoph Molnar
Introduction To Conformal Prediction With
Python
A Short Guide for Quantifying Uncertainty of Machine Learning Models
© 2023 Christoph Molnar, Germany, Munich
christophmolnar.com
For more information about permission to reproduce selections from this book,
write to [email protected].
2023, First Edition
Christoph Molnar
c/o MUCBOOK, Heidi Seibold
Elsenheimerstraße 48
80687 München, Germany
1 Summary
2 Preface
7 Classification
7.1 Back to the beans
7.2 The naive method doesn't work
7.3 The Score method is simple but not adaptive
7.4 Use Adaptive Prediction Sets (APS) for conditional coverage
7.5 Top-k method for fixed size sets
7.6 Regularized APS (RAPS) for small sets
7.7 Group-balanced conformal prediction
7.8 Class-Conditional APS (CCAPS) for coverage by class
7.9 Guide for choosing a conformal classification method
11 Q & A
11.1 How do I choose the calibration size?
11.2 How do I make conformal prediction reproducible?
11.3 How does alpha affect the size of the prediction regions?
11.4 What happens if I choose a large 𝛼 for conformal classification?
11.5 How to interpret empty prediction sets?
11.6 Can I use the same data for calibration and model evaluation?
11.7 What if I find errors in the book or want to provide feedback?
12 Acknowledgements
References
1 Summary
A prerequisite for trust in machine learning is uncertainty quantification. Without
it, an accurate prediction and a wild guess look the same.
Yet many machine learning models come without uncertainty quantification. And
while there are many approaches to uncertainty – from Bayesian posteriors to
bootstrapping – we have no guarantees that these approaches will perform well
on new data.
At first glance conformal prediction seems like yet another contender. But conformal prediction can work in combination with any other uncertainty approach and has many advantages that make it stand out.

Sound good? Then this is the right book for you to learn about this versatile, easy-to-use yet powerful tool for taming the uncertainty of your models.
This book:
• Demonstrates how conformal prediction works for classification and regression
• Shows how to apply conformal prediction using Python
• Enables you to quickly learn new conformal algorithms
With the knowledge in this book, you’ll be ready to quantify the uncertainty of
any model.
2 Preface
My first encounter with conformal prediction was years ago, when I read a paper
on feature importance. I wasn't looking for uncertainty quantification. Nevertheless, I tried to understand conformal prediction but was quickly discouraged
because I didn’t immediately understand the concept. I moved on.
About 4 years later, conformal prediction kept popping up on my Twitter and
elsewhere. I tried to ignore it, mostly successfully, but at some point I became
interested in understanding what conformal prediction was. So I dug deeper and
found a method that I actually find intuitive.
My favorite way to learn is to teach, so I decided to do a deep dive in the form of an
email course. For 5 weeks, my newsletter Mindful Modeler1 became a classroom
for conformal prediction. I didn’t know how this experiment would turn out. But
it quickly became clear that many people were eager to learn about conformal
prediction. The course was a success. So I decided to build on that and turn
everything I learned about conformal prediction into a book. You hold the results
in your hand (or in your RAM).
I love turning academic knowledge into practical advice. Conformal prediction is
in a sweet spot: There's an explosion of academic interest and conformal prediction holds great promise for practical data science. The math behind conformal
prediction isn’t easy. That’s one reason why I gave it a pass for a few years. But
it was a pleasant surprise to find that from an application perspective, conformal
prediction is simple. Solid theory, easy to use, broad applicability – conformal
prediction is ready. But it still lives mostly in the academic sphere.
With this book, I hope to strengthen the knowledge transfer from academia to
practice and bring conformal prediction to the streets.
1 https://mindfulmodeler.substack.com/
3 Who This Book Is For
This book is for data scientists, statisticians, machine learners and all other modelers who want to learn how to quantify uncertainty with conformal prediction. Even if you already use uncertainty quantification in one way or another, conformal prediction is a valuable addition to your toolbox.
Prerequisites:
4 Introduction to Conformal
Prediction
In this chapter, you’ll learn
• In automated insurance claim processing, workers can use the uncertainty estimates to prioritize their review of the claim and intervene if necessary.
• Uncertainty quantification can be used to improve the user experience in a
banking app. While the classification of financial transactions into “rent,”
“groceries,” and so on can be largely automated through machine learning,
there will always be transactions that are difficult to classify. Uncertainty
quantification can identify tricky transactions and prompt the user to classify them.
• Demand forecasting using machine learning can be improved by using uncertainty quantification, which can provide additional context on the confidence in the prediction. This is especially important in situations where
the demand must meet a certain threshold in order to justify production.
By understanding the uncertainty of the forecast, an organization can make
more informed decisions about whether to proceed with production.
Note
As a rule of thumb, you need uncertainty quantification whenever a point
prediction isn’t informative enough.
• The model is trained on a random sample of data, making the model itself
a random variable. If you were to train the model on a different sample
from the same distribution, you would get a slightly different model.
• Some models are even trained in a non-deterministic way. Think of random
weight initialization in neural networks or sampling mechanisms in random
forests. If you train a model with non-deterministic training twice on the
same data, you will get slightly different models.
• This uncertainty in model training is worse when the training dataset is
small.
• Hyperparameter tuning, model selection, and feature selection have the same problem – all of these modeling steps involve estimation based on random samples of data, which adds uncertainty to the modeling process.
• The data may not be perfectly measured. The features or the target may
contain measurement errors, such as people filling out surveys incorrectly,
copying errors, and faulty measurements.
• Data sets may have missing values.
Some examples:
• Let’s say we’re predicting house values. The floor type feature isn’t always
accurate, so our model has to work with data that contains measurement
errors. For this and other reasons, the model will not always predict the
house value correctly.
• Decision trees are known to be unstable – small changes in the data can lead to large differences in what the tree looks like. While this type of uncertainty
is “invisible” when only one tree is trained, it becomes apparent when the
model is retrained, since a new tree will likely have different splits.
• Image classification: Human labelers may disagree on how to classify an image. A dataset labeled by different humans will therefore contain uncertainty, as the model will never be able to perfectly predict the "correct" class, because the true class is up for debate.
Figure: (a) Clearly a dog. (b) Don't let these dogs bamboozle you. They want you to believe that they are ghosts. They are not!
One classification is quite clear, because the probability is so high. In the other
case, it was a close call for the “cat” category, so we would assume that this
classification was less certain.
At first glance, aren’t we done when the model outputs probabilities and we use
them to get an idea of uncertainty? Unfortunately, no. Let’s explore why.
The main problem is that these approaches don’t come with any reasonable1
guarantee that they cover the true outcome (Niculescu-Mizil and Caruana 2005;
Lambrou et al. 2012; Johansson and Gabrielsson 2019; Dewolf et al. 2022).
Naive Approach
The naive approach is to take at face value the uncertainty scores that the model spits out – confidence intervals, variance, Bayesian posteriors, multi-class probabilities. The problem: you can't expect these outcomes to be well calibrated.
1 Some methods, such as Bayesian posteriors, actually do have guarantees that they cover the true values. However, this depends on modeling assumptions, such as the priors and data distributions. Such distributional assumptions are an oversimplification for practically all real applications and are likely to be violated. Therefore, you can't count on coverage guarantees that are based on strong assumptions.
Conformal prediction changes what a prediction looks like: it turns point predictions into prediction regions. For multi-class classification it turns the class output into a set of classes.
Before we delve into theory and intuition, let's see conformal prediction in action.
5 Getting Started with Conformal
Prediction in Python
In this chapter, you’ll learn:
The code examples use the following software versions:
• Python (3.10.7)
• scikit-learn (1.2.0)
• MAPIE1 (0.6.1)
• pandas (1.5.2)
• matplotlib (3.6.2)
Before we dive into any kind of theory of conformal prediction, let's just get a feel for it with a code example.
1 https://mapie.readthedocs.io/en/latest/index.html
5.2 Let’s classify some beans
A (fictional) bean company uses machine learning to automatically classify dry
beans2 into 1 of 7 different varieties: Barbunya, Bombay, Cali, Dermason, Horoz,
Seker, and Sira.
The bean dataset contains 13,611 beans (Koklu and Ozkan 2020). Each row is a
dry bean with 8 measurements such as length, roundness, and solidity, in addition
to the variety which is the prediction target.
The different varieties have different characteristics, so it makes sense to classify
them and sell the beans by variety. Planting the right variety is important for
reasons of yield and disease protection. Automating this classification task with
machine learning frees up a lot of time that would otherwise be spent doing it
manually.
Here is how to download the data:
import os
import wget
import zipfile
from os.path import exists
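A minimal sketch of the download step follows; the UCI URL and file names are assumptions, not taken from the original:

# Download and unpack the dry bean dataset (URL and file names are assumptions)
bean_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00602/DryBeanDataset.zip"
if not exists("DryBeanDataset.zip"):
    wget.download(bean_url)
    with zipfile.ZipFile("DryBeanDataset.zip", "r") as z:
        z.extractall(".")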
2 Dry beans are not to be confused with dried beans. Well, you buy dry beans dried, but not all dried beans are dry beans. Get it? Dry beans are a type of bean (small and white) eaten in Turkey, for example.
The model was trained in ancient times by some legendary dude who left the
company a long time ago. It’s a Naive Bayes model. And it sucks. This is his
code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
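A minimal sketch of what that training code could look like; the file name, column name, and split sizes are assumptions, and the test split used for evaluation is omitted:

# Read the data and encode the target (file and column names are assumptions)
beans = pd.read_excel("DryBeanDataset/Dry_Bean_Dataset.xlsx")
le = LabelEncoder()
y = pd.Series(le.fit_transform(beans["Class"]))
X = beans.drop(columns=["Class"]).values

# Split into training data, calibration data, and "new" unlabeled beans
# (the split sizes are illustrative assumptions)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=10000, random_state=42)
X_calib, X_new, y_calib, y_new = train_test_split(
    X_rest, y_rest, test_size=1000, random_state=42)

# The legacy Naive Bayes model
model = GaussianNB().fit(X_train, y_train)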
Instead of splitting the data only into training and testing, we split the 13,611 beans into separate sets for training, testing, calibration, and prediction ("new" data). On the test data, the model reaches:

Accuracy: 0.758
          BARBUNYA  BOMBAY  CALI  DERMASON  HOROZ  SEKER  SIRA
BARBUNYA        46       0    47         0      6      0     4
BOMBAY           0      33     0         0      0      0     0
CALI            20       0    81         0      3      0     0
DERMASON         0       0     0       223      0     32     9
HOROZ            0       0     4         3    104      0    22
SEKER            2       0     0        26      1    127    22
SIRA             0       0     0        10     10     21   144
75.80% of the beans in the test data are classified correctly. How to read this
confusion matrix: rows indicate the true classes and columns the predicted classes.
For example, 47 BARBUNYA beans were falsely classified as CALI.
The classes seem to have different classification difficulties, for example Bombay
is always classified correctly in the test data, but Barbunya only half of the time.
Overall the model is not the best model.3
Unfortunately, the model can't be easily replaced because it's hopelessly intertwined with the rest of the bean company's backend. And nobody wants to be the one to pull the wrong piece out of this Jenga tower of a backend.
The dry bean company is in trouble. Several customers have complained that
they bought bags of one variety of beans but there were too many beans of other
varieties mixed in.
The bean company holds an emergency meeting and it’s decided that they will
offer premium products with a guaranteed percentage of the advertised bean
variety. For example, a bag labeled “Seker” should contain at least 95% Seker
beans.
3 Other models, like random forest, are more likely to be calibrated for this dataset. But I found that out later, when I was already pretty invested in the dataset. And I liked the data, so we'll stick with this example. And it's not that uncommon to get stuck with suboptimal solutions in complex systems, like legacy code, etc.
# Get the "probabilities" from the model
predictions = model.predict_proba(X_calib)
# Get for each instance the highest probability
high_prob_predictions = np.amax(predictions, axis=1)
# Select the predictions where the probability is at least 95%
high_p_beans = np.where(high_prob_predictions >= 0.95)
# Let's count how often we hit the right label
its_a_match = (model.predict(X_calib) == y_calib)
coverage = np.mean(its_a_match.values[high_p_beans])
print(round(coverage, 3))
0.896
Ideally, 95% or more of the beans should have the predicted class, but she finds
that the 95%-bag only contains 89.6% of the correct variety.
Now what?
She could use methods such as Platt scaling or isotonic regression to calibrate
these probabilities, but again, with no guarantee of correct coverage for new
data.
But she has an idea.
𝑠𝑖 = 1 − 𝑓(𝑥𝑖)[𝑦𝑖]
A slightly sloppy notation for saying that we take 1 minus the model score for
the true class. For example, if the ground truth for bean number 8 is “Seker”
and the probability score for Seker is 0.9, then 𝑠8 = 0.1. In conformal prediction
language, this 𝑠𝑖-score is called the non-conformity score.
Non-conformity score
The non-conformity score 𝑠𝑖 for a new data point measures how unusual a suggested outcome 𝑦 seems given the model output for 𝑥𝑖. To decide which of the possible 𝑦's are "conformal" (and together form the prediction region), conformal prediction calculates a threshold. This threshold is based on the non-conformity scores of the calibration data in combination with their true labels.
The threshold is therefore chosen to cover 95% of the true bean classes.
In Python, this procedure can be done in just a few lines of code:
# Non-conformity scores on the calibration data: 1 minus the probability of the true class
predictions = model.predict_proba(X_calib)
scores = 1 - predictions[np.arange(len(y_calib)), y_calib]
n = len(y_calib)
# Setting the alpha so that we get 95% prediction sets
alpha = 0.05
# Define the quantile level (with finite-sample correction)
q_level = np.ceil((n+1)*(1-alpha))/n
qhat = np.quantile(scores, q_level, method='higher')
[Figure: histogram of 1 − s(y, x) for the calibration data; y-axis: frequency]
Prediction Set
A prediction set – for multi-class tasks – is a set of one or more classes. Conformal classification gives you a set for each instance.

To generate the prediction sets for a new data point, the data scientist has to combine all classes whose non-conformity score is below the threshold 𝑞̂ into a set.
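A minimal sketch of that step, assuming X_new holds the new beans and qhat is the threshold computed above:

# A class is in the set if its score 1 - p is at most qhat,
# i.e. if its predicted probability p is at least 1 - qhat
prediction_sets = model.predict_proba(X_new) >= (1 - qhat)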
for i in range(3):
print(le.classes_[prediction_sets[i]])
['DERMASON']
['DERMASON']
['DERMASON' 'SEKER']
On average, the prediction sets cover the true class with a probability of 95%.
That’s the guarantee we get from the conformal procedure.
How could the bean company work with such prediction sets? The first set has
only 1 bean variety “DERMASON”, so it would go into a DERMASON bag.
Bean #3 has a prediction set with two varieties. Maybe a chance to offer bean
products with guaranteed coverage, but containing two varieties? Anything with
more categories could be sorted manually, or the CEO could finally make bean
stew for everyone.
The CEO is now more relaxed and confident in the product.
Spoiler alert: the coverage guarantees don't work the way the bean CEO thinks they do, as we will soon learn (what they actually need is a class-wise coverage guarantee, which we will learn about in the classification chapter).
And that’s it. You have just seen conformal prediction in action. To be exact,
this was the score method that you will encounter again in the classification
chapter.
4 https://mapie.readthedocs.io/en/latest/index.html
The same procedure is easier with the MAPIE library:

from mapie.classification import MapieClassifier
We're no longer working with the Naive Bayes model object, but our model is now a MapieClassifier object. If you are familiar with the sklearn library, it will feel natural to work with objects in MAPIE. These MAPIE objects have a .fit()-function and a .predict()-function, just like sklearn models do. MapieClassifier can be thought of as a wrapper around our original model.
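Here is a sketch of how the wrapper might be used with the objects from the earlier steps; the method name and options are assumptions based on MAPIE 0.6.1, where cv="prefit" tells MAPIE that the model is already trained:

import numpy as np

# Wrap the trained Naive Bayes model and calibrate on the calibration data
cp_score = MapieClassifier(estimator=model, cv="prefit", method="score")
cp_score.fit(X_calib, y_calib)

# Class predictions and conformal prediction sets for the new beans
y_pred, y_set = cp_score.predict(X_new, alpha=0.05)
y_set = np.squeeze(y_set)  # drop the trailing alpha dimension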
And when we use the “predict” method of this conformal classifier, we get both the
usual prediction (“y_pred”) and the sets from the conformal prediction (“y_set”).
It’s possible to specify more than one value for 𝛼. But in the code above only 1
value was specified, so the resulting y_set is an array of shape (1000, 7, 1), which
means 1000 data points, 7 classes, and 1 𝛼. The np.squeeze function removes the
last dimension.
Let's have a look at some of the resulting prediction sets. Since y_set only contains "True" and "False" at the corresponding class indices, we have to use the class labels to get readable results. Here are the first 5 prediction sets for the beans:
for i in range(5):
print(le.classes_[y_set[i]])
['DERMASON']
['DERMASON']
['DERMASON' 'SEKER']
['DERMASON']
['DERMASON' 'SEKER']
These prediction sets are of size 1 or 2. Let’s have a look at all the other beans
in X_new:
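The counts below come from summing each row of the boolean set matrix (the same computation is used again further below):

set_sizes = y_set.sum(axis=1)
print(pd.Series(set_sizes).value_counts())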
2 871
1 506
3 233
4 1
dtype: int64
Most sets have size 1 or 2, many fewer have 3 varieties, only one set has 4 varieties
of beans.
This looks different if we make 𝛼 small, saying that we want a high probability
that the true class is in there.
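For example, with a much smaller 𝛼 (the value 0.01 used here is an assumption):

y_pred, y_set = cp_score.predict(X_new, alpha=0.01)
y_set = np.squeeze(y_set)
for i in range(4):
    print(le.classes_[y_set[i]])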
['DERMASON']
['DERMASON' 'SEKER']
['DERMASON' 'SEKER']
['DERMASON']
set_sizes = y_set.sum(axis=1)
print(pd.Series(set_sizes).value_counts())
3 780
2 372
4 236
1 222
5 1
dtype: int64
As expected, we get larger sets with a lower value for 𝛼. This is because the
lower the 𝛼, the more often the sets have to cover the true class. So we can
already see that there is a trade-off between set size and coverage. We pay for
higher coverage with larger set sizes. That’s why 100% coverage (𝛼 = 0) would
produce a stupid solution: it would just include all bean varieties in every set for
every bean.
If we want to see the results under different 𝛼’s, we can pass an array to MAPIE.
MAPIE will then automatically calculate the sets for all the different 𝛼 confidence
levels. We just have to make sure that we use the third dimension to pick the
right value:
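A sketch with two 𝛼 values; the values and the bean index printed below are assumptions:

y_pred_multi, y_set_multi = cp_score.predict(X_new, alpha=[0.1, 0.05])
# y_set_multi has shape (n, 7, 2); the last axis indexes the two alphas
print(le.classes_[y_set_multi[10, :, 1]])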
['HOROZ' 'SIRA']
We can also create a pandas DataFrame to hold our results, which will print
nicely:
df = pd.DataFrame()
for i in range(len(y_pred)):
    predset = le.classes_[y_set[i]]
    # Create a new dataframe with the calculated values
    temp_df = pd.DataFrame({
        "set": [predset],
        "setsize": [len(predset)]
    }, index=[i])
    # Concatenate the new dataframe with the existing one
    df = pd.concat([df, temp_df])
print(df.head())
print(df.head())
set setsize
0 [DERMASON] 1
1 [DERMASON] 1
2 [DERMASON, SEKER] 2
3 [DERMASON] 1
4 [DERMASON, SEKER] 2
Working with conformal prediction and MAPIE is a great experience. But are
the results really what the bean company was looking for? We’ll learn in the
Classification chapter why the bean CEO may have been celebrating too soon. A
hint: the coverage guarantee of the conformal predictor only holds on average –
not necessarily per class.
Coverage
Coverage is the probability that the prediction region contains the true outcome. The conformal guarantee is marginal: it holds on average over many predictions at the chosen level 1 − 𝛼, not necessarily for every class or subgroup.
6 Intuition Behind Conformal
Prediction
In this chapter, you will learn
Let’s say you have an image classifier that outputs probabilities, but you want
prediction sets with guaranteed coverage of the true class.
First, we sort the predictions of the calibration dataset from certain to uncertain.
The calibration dataset must be separate from the training dataset. For the
image classifier, we could use 𝑠𝑖 = 1 − 𝑓(𝑥𝑖 )[𝑦𝑖 ] as the so-called non-conformity
score, where 𝑓(𝑥𝑖 )[𝑦𝑖 ] is the model’s probability output for the true class. This
procedure places all images somewhere on a scale of how certain the classification
is, as shown in the following figure.
Models have a tendency to overfit the training examples, which in turn biases
their non-conformity scores. If we were to calibrate using the training data,
it’s likely that the threshold would be too small and therefore the coverage
would be too low (less than 1 − 𝛼). The guaranteed coverage only works by
calibrating with data that wasn’t used to train the model.
The dog on the left has a model output of 0.95 and therefore gets s = 0.05, but
the dogs on the right in their spooky costumes bamboozle the neural network.
This spooky image gets a score of only 0.15 for the class dog, which translates
into a score of s = 0.85.
Figure 6.1: Images from calibration data sorted from certain to uncertain
We rely on this ordering of the images to divide the images into certain (or
conformal) and uncertain. The size of each fraction depends on the confidence
level 𝛼 that the user chooses.
If 𝛼 = 0.1, then we want to have 90% of the calibration images in the "certain" section.
Finding the threshold is easy because it means calculating the quantile 𝑞:̂ the
score value where 90% (= 1 − 𝛼) of the images are below and 10% (= 𝛼) are
above:
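In formula form, with the same finite-sample correction used in the code earlier:

𝑞̂ = Quantile(𝑠₁, …, 𝑠ₙ; ⌈(𝑛 + 1)(1 − 𝛼)⌉/𝑛)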
In this example, the scary dogs fall into the uncertain region.
Another assumption that conformal prediction requires is exchangeability.
Exchangeability
For the coverage guarantee to hold, the calibration data must be "exchangeable" with the new data we expect. For example, if they are randomly drawn from the same distribution, they are exchangeable. If they come from different distributions, they may not be exchangeable.
Time series data, for example, are not exchangeable, since the temporal order
matters. We will see how conformal prediction can still be adapted for such cases.
Figure 6.2: The threshold divides images along the uncertainty scale into certain
and uncertain.
Coverage alone isn't the only goal: we could trivially reach any coverage level with huge prediction sets, at the price of having too many "wrong" labels in the prediction sets. CP researchers therefore always look at the average size of prediction sets. Given that two CP algorithms provide the same guaranteed coverage, the preferred algorithm is the one that produces smaller prediction sets. In addition, some CP algorithms guarantee upper bounds on the coverage probability, which also keeps the sets small.
Let’s move on to the conformal prediction step.
For a new image, we check all possible classes: compute the non-conformity score
for each class and keep the classes where the score falls below the threshold 𝑞.̂
All scores below the threshold are conformal with scores that we observed in the
calibration set and are seen as certain enough (based on 𝛼).
In this example, the image has the prediction set {cat, lion} because both classes
are “conformal” and made the cut. All other class labels are too uncertain and
therefore excluded.
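A tiny sketch with made-up numbers (the probabilities and threshold here are purely illustrative):

# Hypothetical model probabilities for one new image
probs = {"cat": 0.60, "lion": 0.34, "dog": 0.04, "ghost": 0.02}
qhat = 0.72  # illustrative threshold from the calibration step
prediction_set = [c for c, p in probs.items() if 1 - p <= qhat]
print(prediction_set)  # ['cat', 'lion']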
Now perhaps it is clearer what happens to the “wrong classes”: If the model
is worth its money, the probabilities for the wrong classes will be rather low.
Therefore the non-conformity score will probably be above the threshold and the
corresponding classes will not be included in the prediction set.
In the case of classification, the y’s are classes and for regression, the y’s are all
possible values that could be predicted.
A big differentiator between conformal prediction algorithms is the choice of the
non-conformity score. In addition, they can differ in the details of the recipe and
slightly deviate from it as well. In a way, the recipe isn’t fully accurate, or rather
it’s about a specific version of conformal prediction that is called split conformal
prediction. Splitting the data only once into training and calibration is not the
best use of data. If you are familiar with evaluation in machine learning, you
won’t be surprised about the following extensions.
• A single split into training and calibration data
• Cross-splitting (repeated splits, as in k-fold cross-validation)
• Leave-one-out (jackknife)

Do these sound familiar to you? If you are familiar with evaluating and tuning machine learning algorithms, then you already know these resampling strategies.
For evaluating or tuning machine learning models, you also have to work with
data that was not used for model training. So it makes sense that we encounter
the same options for conformal prediction where we also have to find a balance
between training the model with as much data as possible, but also having access
to “fresh” data for calibration.
Figure 6.4: Different strategies for splitting data into training and calibration
sets.
For cross-conformal prediction, you split the data, for example, into 10 pieces.
You take the first 9 pieces together to train the model and compute the non-conformity scores for the remaining 1/10th. You repeat this step 9 times so that
each piece is once in the calibration set. You end up with non-conformity scores
for the entire dataset and can continue with computing the quantile for conformal
prediction as in the single split scenario.
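A minimal sketch of this score collection, reusing the Naive Bayes bean classifier as the model and assuming X and y are NumPy arrays with the full label-encoded data:

import numpy as np
from sklearn.model_selection import KFold
from sklearn.naive_bayes import GaussianNB

scores = np.zeros(len(y))
for train_idx, calib_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    fold_model = GaussianNB().fit(X[train_idx], y[train_idx])
    probs = fold_model.predict_proba(X[calib_idx])
    # Non-conformity score: 1 minus the probability of the true class
    scores[calib_idx] = 1 - probs[np.arange(len(calib_idx)), y[calib_idx]]
# The quantile step on these scores works as in the single-split case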
If you take cross-conformal prediction to the extreme you end up with the leave-one-out (LOO) method, also called jackknife, where you train a total of n models, each with n-1 data points (𝑛 is the number of data points in training and calibration combined).
All three options are inductive approaches to conformal prediction. Another
approach is transductive or full conformal prediction.
Full conformal prediction doesn't split off a calibration set at all, but it comes at a high computational cost for each prediction region: To get the prediction set for a new data point, the model has to be retrained for every possible value of 𝑦𝑛𝑒𝑤. Transductive CP isn't covered in this book.
from mapie.regression import MapieRegressor

# Option 1: the model is already trained, MAPIE only calibrates
cp = MapieRegressor(model, cv="prefit")

# Option 2: let MAPIE handle training and calibration with 10-fold cross-splitting
cp = MapieRegressor(model, cv=10)
Warning
If you don’t specify the cv option at all, MAPIE will use 5-fold cross-splitting
– even if you have already trained your model.
The calibration step is the same for all "cv" options – with cross-splitting or LOO it just takes longer because the model is trained multiple times.
cp.fit(x_calib, y_calib)
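After fitting, prediction intervals come from the predict method; the 𝛼 value is an assumption and X_new stands for new data points:

y_pred, y_interval = cp.predict(X_new, alpha=0.05)
# y_interval has shape (n, 2, 1): lower and upper bound per point and alpha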