LLVM Bugzilla is read-only and is the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 36144 - Operands of the form '0b' in jump instructions in X86 Intel dialect inline assembly are not recognised as valid.
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86
Version: trunk
Hardware: Macintosh MacOS X
Importance: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks: 10988
Reported: 2018-01-30 00:48 PST by Tom Murray
Modified: 2018-10-24 13:35 PDT
CC List: 9 users

See Also:
Fixed By Commit(s): r345189


Attachments
C++ source file, compilation of which with Clang reproduces the issue (638 bytes, text/plain)
2018-01-30 00:48 PST, Tom Murray

Description Tom Murray 2018-01-30 00:48:28 PST
Created attachment 19771
C++ source file, compilation of which with Clang reproduces the issue

Overview:
Using an operand of the form '0b' (a reference to a numbered label that appears earlier in the source than the current position) in any jump instruction (jmp, je, jne, jz, jnz, etc.) in Intel-dialect x86 inline assembly causes an 'Invalid operand for instruction' compilation error. Earlier versions of clang (specific versions are listed below) compiled this correctly, producing code in which the jump target was replaced with the label generated from the numbered label in the inline assembly.

Steps to Reproduce:
1. Download the attached source 'x86IntelInlineAsmJmpToLabelRelativeTest.cpp'
2. Using a version of clang 5.0 or greater (including trunk), attempt to compile this using: clang x86IntelInlineAsmJmpToLabelRelativeTest.cpp -o x86IntelInlineAsmJmpToLabelRelativeTest
3. Observe the compilation error on line 17 of x86IntelInlineAsmJmpToLabelRelativeTest.cpp - "Invalid operand for instruction"

Actual Results:
The program fails to compile.

Expected Results:
The program compiles successfully, with the target of the jump instruction replaced with the correct label generated from the numbered label in the inline assembly.

Build Date & Hardware where bug was first encountered:
26 Jan 2018 - Xcode 9.3 Beta 1 (9Q98q), Apple LLVM version 9.1.0 (clang-902.0.30) - Mac OS 10.13.3 (17D47)

Additional Builds and Platforms:
Clang 5.0.0 (non-Xcode version) release reproduced the issue.
Clang 4.0.0 and 4.0.1 (non-Xcode version) releases did not reproduce the issue.
Locally compiled build of Clang on trunk (@ SVN revision 323529) reproduced the issue.

Additional information:
The attached code performs the same operation twice, first using Intel syntax, then using AT&T syntax to demonstrate the issue exists only in the Intel syntax path.
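The attachment itself is not reproduced in this archive. A minimal sketch of a comparable test follows, assuming GCC-style extended asm that switches dialect with `.intel_syntax noprefix` and back with `.att_syntax prefix`. The function names and the countdown loop are hypothetical and are not taken from the attached file; the non-x86 fallback exists only so the sketch builds everywhere.

```cpp
#include <cassert>

// Hypothetical reproducer sketch; NOT the attached file.
// Both functions count n down to zero using a backward jump to numbered label 0.

int count_down_intel(int n) {
    assert(n > 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile(
        ".intel_syntax noprefix\n\t"
        "0:\n\t"
        "dec eax\n\t"
        "jnz 0b\n\t"            // the operand that affected Clang versions reject
        ".att_syntax prefix\n\t"
        : "+a"(n)               // n is pinned to eax, so the asm can name eax directly
        :
        : "cc");
#else
    while (--n != 0) {}         // portable fallback for non-x86 targets
#endif
    return n;
}

int count_down_att(int n) {
    assert(n > 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile(
        "0:\n\t"
        "decl %%eax\n\t"
        "jnz 0b\n\t"            // the same jump in AT&T syntax is accepted
        : "+a"(n)
        :
        : "cc");
#else
    while (--n != 0) {}         // portable fallback for non-x86 targets
#endif
    return n;
}
```

On a compiler without the bug, both functions return 0 for any positive argument; on an affected Clang, only the Intel-syntax function should fail to compile.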

A brief investigation seems to show this arises from an ambiguity when parsing operands to instructions in Intel syntax after handling of MASM style Intel syntax was added in r280555. Because MASM allows integer literals of the form '011010b', '0b' is a valid integer literal representing value 0. The code in lib/MC/MCParser/AsmLexer.cpp - llvm::AsmLexer::LexDigit() with MASM style Intel assembly handling consumes the 'b' suffix on the literal. This means the special handling of positionally relative jump targets in lib/Target/X86/AsmParser/X86AsmParser.cpp - X86AsmParser::ParseIntelExpression() can no longer correctly detect this form of jump target, and incorrectly identifies the jump instruction's operand as just an integer which causes a compilation error.
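
The ambiguity can be modelled with a toy classifier. This is a sketch only: `TokenKind`, `classify` and the `masmBinarySuffix` flag are hypothetical names, and the logic is a simplification for illustration, not the code in AsmLexer.cpp or X86AsmParser.cpp.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical token categories for this illustration only.
enum class TokenKind { Integer, BackwardLabelRef, ForwardLabelRef, Other };

// Toy model: decide what a token such as "0b" means.
// masmBinarySuffix == true mimics a lexer that accepts MASM-style
// binary literals like "011010b".
TokenKind classify(const std::string& tok, bool masmBinarySuffix) {
    if (tok.empty()) return TokenKind::Other;

    const char last = tok.back();
    const bool hasSuffix = !std::isdigit(static_cast<unsigned char>(last));
    const std::string digits = hasSuffix ? tok.substr(0, tok.size() - 1) : tok;
    if (digits.empty()) return TokenKind::Other;

    bool decimal = true;  // every character is 0-9
    bool binary = true;   // every character is 0 or 1
    for (char c : digits) {
        if (!std::isdigit(static_cast<unsigned char>(c))) decimal = false;
        if (c != '0' && c != '1') binary = false;
    }
    if (!decimal) return TokenKind::Other;
    if (!hasSuffix) return TokenKind::Integer;

    if (last == 'b') {
        // The ambiguity: with MASM literals enabled, the lexer wins and
        // the parser never sees a directional label reference.
        return (masmBinarySuffix && binary) ? TokenKind::Integer
                                            : TokenKind::BackwardLabelRef;
    }
    if (last == 'f') return TokenKind::ForwardLabelRef;
    return TokenKind::Other;
}
```

In this model, `classify("0b", false)` yields a backward label reference, while `classify("0b", true)` yields a plain integer, which is the kind of operand a jump instruction then rejects.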
Comment 1 Jeffrey Walton 2018-04-12 11:42:12 PDT
(In reply to Tom Murray from comment #0)
> ...
> Using operands of the form '0b' (a numbered label appearing earlier in
> source relative to current source position) in jump instructions (any jump
> instruction, jmp, je, jne, jz, jnz etc.) in inline 
>  intel dialect x86 assembly causes an 'Invalid operand for instruction'
> compilation error. In earlier versions (specific versions listed below) of
> clang this compiled correctly, producing code with the jmp target replaced
> with the correct label generated from the numbered label in the inline
> assembly.

The problem is bigger than just 0b. Crypto++ is experiencing it with 1b, also:

   "1:\n"
   ...
   "jnz 1b;\n"

Thanks to @mouse07410 and @alanbirtles at https://github.com/weidai11/cryptopp/issues/636 .
Comment 2 Francis Visoiu Mistrih 2018-07-05 08:26:19 PDT
My understanding of this is that https://reviews.llvm.org/rL301390 sets

> getParser().setParsingInlineAsm(true);

whenever it sees .intel_syntax. From my reading of the previous comments in various reviews / bugs, I understand that we rely on isParsingInlineAsm to actually check if we're parsing MSVC-style Intel syntax asm.

This breaks both things like the example Jeffrey attached:

> 1:
>   jnz 1b

and any binary immediate like:

> .intel_syntax
> and edi, 0b010101

What is the preferred way to fix this? Should we introduce a new flag for clang to set in ParseMicrosoftAsmStatement and provide a hidden flag in LLVM to be able to test it independently?
Comment 3 Francis Visoiu Mistrih 2018-07-31 09:52:57 PDT
Ping? I think this is important to fix or revert as it's a pretty basic thing to support.
Comment 4 Reid Kleckner 2018-10-22 16:06:19 PDT
I applied https://reviews.llvm.org/D53535 and it seems to fix the issue.
Comment 5 Reid Kleckner 2018-10-24 13:35:35 PDT
Fixed by r345189.