Get assembly, LLVM bytecode or clang AST output

Wolf_Vollprecht · September 24, 2018, 4:22pm

Hi all,

I am wondering whether it’s already possible to get the generated assembly, LLVM bytecode or clang AST output from the cling interpreter.

This would be a useful tool for performance debugging or teaching.

I’ve been looking a bit at Julia, and they have macros (@code_native, @code_llvm, @code_typed) for these three features.

If it doesn’t exist yet, how much work would it be to add such a feature?

It could be nice to implement that as a Jupyter magic for the notebook.

Cheers,

Wolf

Axel · September 24, 2018, 6:18pm

Hi Wolf,

We have been discussing such a feature again and again - we just never found the time to do it! I.e. yes this would be a fantastic contribution!

A simple way of implementing this is to:

add an interpreter state that is toggled by .trace, and if enabled:
dump the AST of the transaction
dump the IR / module that CodeGen creates of the transaction
and then maybe even show the symbols that OrcJIT emitted.

At a later stage we could have ".trace symbol", which would only trace the declarations / symbols whose names contain "symbol". We already have a rudimentary start of this, see MetaParser::istraceCommand().
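A minimal sketch of the state described above: a flag toggled by .trace plus an optional name filter for ".trace symbol". All names here (TraceState, handleTraceCommand, shouldTrace) are hypothetical and are not part of cling. The sketch only models the decision logic and leaves the actual AST / IR dumping to the caller.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: none of these names exist in cling.
class TraceState {
public:
  // ".trace" without an argument toggles tracing of everything.
  // ".trace <symbol>" turns tracing on, restricted to names containing <symbol>.
  void handleTraceCommand(const std::string& arg = "") {
    if (arg.empty()) {
      enabled_ = !enabled_;
      filter_.clear();
    } else {
      enabled_ = true;
      filter_ = arg;
    }
  }

  bool enabled() const { return enabled_; }

  // Decide whether a declaration or emitted symbol should be dumped.
  bool shouldTrace(const std::string& name) const {
    if (!enabled_) return false;
    return filter_.empty() || name.find(filter_) != std::string::npos;
  }

private:
  bool enabled_ = false;
  std::string filter_;  // An empty filter means "trace everything".
};
```

With something like this, the code that handles a transaction would check shouldTrace(name) before dumping the AST, the IR, or the emitted symbols.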

That would indeed be a super useful tool: for teaching, for understanding and debugging code, and even for debugging cling!

Let me know whether you’d be interested in trying to contribute this. We would be happy to help!

Cheers, Axel.

Wolf_Vollprecht · September 25, 2018, 3:19pm

Hi Axel,

thanks for the reply!

I just tried out .trace ast and was delighted to see that it does indeed print something. However, it’s a lot of output, and I couldn’t easily tell where it starts and ends.

Are there more docs on .trace somewhere? Do you have to start tracing, then stop at some point and print?

Sure, I’d be happy to contribute. I’ll have to figure out how much work it is, though.

Cheers!

Wolf

Wolf_Vollprecht · September 25, 2018, 3:28pm

Ok, actually I just read the code and think I’ve got a good enough understanding.

However, right after cling starts up, the AST is already huge. I’m wondering why that is. Are some headers included when cling starts?

Maybe making .trace actually start/end tracing could make sense? Then one could record the AST portion that one is actually interested in.

For the Jupyter magic, we’d simply start tracing at the beginning of the cell and stop at the end.
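To illustrate the start/stop idea, here is a rough sketch in which a recorder only keeps output while tracing is active, and an RAII guard brackets one notebook cell. TraceRecorder and CellTrace are made-up names for illustration, not cling or Jupyter kernel APIs, and the strings passed to record() stand in for real AST dumps.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical sink standing in for wherever the interpreter writes its dumps.
class TraceRecorder {
public:
  // Begin a fresh recording, discarding anything captured earlier.
  void start() {
    buffer_.str("");
    buffer_.clear();
    active_ = true;
  }

  // End the recording and return what was captured since start().
  std::string stop() {
    active_ = false;
    return buffer_.str();
  }

  // Called for every dump; ignored unless a recording is in progress.
  void record(const std::string& text) {
    if (active_) buffer_ << text << '\n';
  }

  bool active() const { return active_; }

private:
  bool active_ = false;
  std::ostringstream buffer_;
};

// Hypothetical RAII guard: tracing starts when a cell begins executing and
// stops when the guard goes out of scope at the end of the cell.
class CellTrace {
public:
  CellTrace(TraceRecorder& recorder, std::string& out)
      : recorder_(recorder), out_(out) {
    recorder_.start();
  }
  ~CellTrace() { out_ = recorder_.stop(); }

  CellTrace(const CellTrace&) = delete;
  CellTrace& operator=(const CellTrace&) = delete;

private:
  TraceRecorder& recorder_;
  std::string& out_;
};
```

Anything recorded before the guard is constructed (for example, the headers pulled in at startup) is dropped, so only the portion belonging to the cell is kept.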

Wolf_Vollprecht · September 25, 2018, 5:24pm

So far I’ve been able to hack cling so that it prints the LLVM IR for every execution. That was actually really simple, as I am just calling M->dump() (or M->print(...)) on the generated module.

I found the Julia sources that handle disassembling the compiled output here: https://github.com/JuliaLang/julia/blob/master/src/disasm.cpp

So maybe getting something to work is not so hard after all.

Wolf_Vollprecht · September 25, 2018, 5:56pm

If I connected the disassembler (adapted, e.g., from the Julia source) to this callback (https://github.com/root-project/cling/blob/master/lib/Interpreter/IncrementalJIT.h#L57) in IncrementalJIT (which was apparently used by GDB), would that work?

Axel · September 26, 2018, 3:37pm

What about submitting a PR for this? If so - let’s do it step by step! I’d add the disassembler in a second step.

I will need to re-attach GDB at some point, which will also make use of that callback. But we can have two clients listening there!
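As a rough illustration of "two clients listening", the sketch below fans a single emission event out to any number of registered listeners. EmissionNotifier and its methods are hypothetical and do not reflect the actual callback in IncrementalJIT. A plain symbol name stands in for the emitted object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: not the callback interface of cling's IncrementalJIT.
class EmissionNotifier {
public:
  using Listener = std::function<void(const std::string& symbolName)>;

  // Register another client, e.g. a debugger hook or a disassembler.
  void addListener(Listener listener) {
    listeners_.push_back(std::move(listener));
  }

  // Called once per emitted symbol; every client is notified in
  // registration order.
  void notifyEmitted(const std::string& symbolName) const {
    for (const auto& listener : listeners_) listener(symbolName);
  }

  std::size_t listenerCount() const { return listeners_.size(); }

private:
  std::vector<Listener> listeners_;
};
```

A debugger registration and a disassembler could then each call addListener() without knowing about the other.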

Wolf_Vollprecht · September 27, 2018, 5:50pm

@Axel I’ve posted the “progress” so far in a GitHub PR.

Note that the code is hacky; I just wanted to see how far I could get quickly.

Maybe you can help me get over the bump?
