[Schema Compiler] Detect non-ASCII .idl characters earlier
Non-ASCII characters cause confusing errors on Windows when passed
through stdin. Previously these were detected in idl_schema.py but that
is too late, so the detection is moved to _GetIDLParseError.
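
For illustration, the early check can be sketched as a standalone helper.
This is a minimal sketch, not the presubmit code itself: the name
first_non_ascii_error is hypothetical, and it takes the already-decoded
file contents as a str. It relies on str.isascii(), which needs
Python 3.7 or newer.

```python
def first_non_ascii_error(contents):
    """Return an error message for the first non-ASCII character, or None.

    Hypothetical helper for illustration; `contents` is the decoded text
    of an .idl file.
    """
    for i, char in enumerate(contents):
        # `i` is a character offset into the decoded text, not a byte offset.
        if not char.isascii():
            return ('Non-ascii character "%s" (ord %d) found at offset %d.'
                    % (char, ord(char), i))
    # Every character is ASCII, so there is nothing to report.
    return None
```

Called on text that contains a Cyrillic "р" (U+0440), the helper reports
ord 1088 and the index of that character; for pure-ASCII text it returns
None.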
This also fixes the one remaining instance of non-ASCII characters in
.idl files (for realz this time).
The old error from these last UTF-8 characters was:
extensions\common\api\declarative_net_request.idl could not be parsed: 'charmap' codec can't encode characters in position 7323-7324: character maps to <undefined>
The new error for that same instance (which this change also fixes) is much clearer:
extensions\common\api\declarative_net_request.idl could not be parsed: Non-ascii character "р" (ord 1088) found at offset 7120.
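
The old 'charmap' failure can be reproduced in isolation. The sketch below
assumes the Windows code page involved was cp1252; the commit does not say
which one it was, and try_encode is a hypothetical helper name. Python
implements cp1252 as a charmap codec, which is where the wording of the
old message comes from.

```python
def try_encode(text, encoding="cp1252"):
    """Return None if `text` encodes cleanly, else the codec's error message.

    Hypothetical helper; cp1252 is an assumed stand-in for the console
    code page on the affected Windows machines.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError as e:
        return str(e)
    return None
```

Passing a string that contains Cyrillic "р" (U+0440) returns a message that
starts with "'charmap' codec can't encode" and ends with "character maps to
<undefined>", which names neither the character nor the cause. Plain ASCII
text returns None.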
Continued from brucedawson@'s change here:
https://chromium-review.googlesource.com/c/chromium/src/+/3590992
Bug: 1309977
Change-Id: I39d173e2edc832bc7c856e2681406a40cf1bb97a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3598351
Reviewed-by: Bruce Dawson <[email protected]>
Commit-Queue: Devlin Cronin <[email protected]>
Cr-Commit-Position: refs/heads/main@{#994909}
diff --git a/PRESUBMIT.py b/PRESUBMIT.py
index 0ca5aca..30e051a 100644
--- a/PRESUBMIT.py
+++ b/PRESUBMIT.py
@@ -2593,6 +2593,10 @@
 def _GetIDLParseError(input_api, filename):
   try:
     contents = input_api.ReadFile(filename)
+    for i, char in enumerate(contents):
+      if not char.isascii():
+        return ('Non-ascii character "%s" (ord %d) found at offset %d.'
+                % (char, ord(char), i))
     idl_schema = input_api.os_path.join(input_api.PresubmitLocalPath(),
                                         'tools', 'json_schema_compiler',
                                         'idl_schema.py')