C++ API Registering gcc variables into cling (part 2)

Hi All,

TL;DR: I’m trying to figure out whether it’s possible to register std::functions (lambdas) with the interpreter and execute them, all within a single scope, so that I can run the same script endlessly without cling complaining about redeclared variables.

I’m using ROOT 6.06.02 compiled with both gcc 5.3.0 and gcc 4.9.3 on various flavors of Linux (the gcc version doesn’t bother me at the moment).

Right now I’m trying to register functions from my compiled (gcc) session with my interpreter in a very recyclable manner.

What I’m trying to do essentially is to run scripts that are all scoped. In other words, I want to only pass “{ [string_to_interpret] }” to the interpreter, such that all of the variables I tell cling to allocate get deallocated and/or forgotten.

There are issues with using unload() (sft.its.cern.ch/jira/browse/ROOT-7939),
and child interpreters aren’t supported yet (API: scopes and curley braces). (I promised Axel I’d help out, but I haven’t gotten around to it yet.)

I also don’t know of any way of copying a cling::Interpreter instance.

DISCLAIMER: in this example, I’m using boost::optional.

#include <array>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <iostream>
#include <string>
#include <boost/optional.hpp>
#include <cling/Interpreter/Interpreter.h>

static const std::string PATH_TO_LLVM {"/home/d4nf/Custom_Libs/Root-CERN/root-6.06.00.gcc530/etc/cling"};
static const char * clingArgs[] = {"", "-std=c++11"};


struct Mobject
{
    void set(double _d) {d = _d;}
    double get() {return d.get();};
private:
    boost::optional<double> d;
    boost::optional<uint32_t> b;
    boost::optional<int32_t> a;
};

int main()
{
    Mobject foo, foo2, foo3;
    foo2.set(33.334);
    cling::Interpreter inter(2, clingArgs, PATH_TO_LLVM.c_str());
    //inter.AddIncludePath("/usr/include");
    inter.declare("#include <functional>\n #include <boost/optional.hpp>");
    std::function<void(double)> fun_setter = [&foo] (double d_cpy) -> void { foo.set(d_cpy);};
    std::function<double()> fun_getter = [&foo] () -> double {return foo.get();};
    std::function<void()> printHello = []() {std::cout << "Hi Everyone!!!\n";};


    // %llu expects unsigned long long, so hold the addresses in that type
    unsigned long long setter_addr = reinterpret_cast<uintptr_t>(&fun_setter);
    unsigned long long getter_addr = reinterpret_cast<uintptr_t>(&fun_getter);
    unsigned long long hello_addr = reinterpret_cast<uintptr_t>(&printHello);
    std::array<char, 4096> toExec;

    int BytesWritten = 0;
    BytesWritten = std::sprintf( &toExec[BytesWritten], "%s * fun_setter_mangle = reinterpret_cast<%s*>(%llu);",
                  "std::function<void(double)>", "std::function<void(double)>", setter_addr);
    BytesWritten += std::sprintf( &toExec[BytesWritten], "%s & fun_setter = *fun_setter_mangle;",
                  "std::function<void(double)>");

    BytesWritten += std::sprintf( &toExec[BytesWritten], "%s * fun_getter_mangle = reinterpret_cast<%s*>(%llu);",
                  "std::function<double()>", "std::function<double()>", getter_addr);
    BytesWritten += std::sprintf( &toExec[BytesWritten], "%s & fun_getter = *fun_getter_mangle;",
                  "std::function<double()>");
    inter.process(toExec.data());
    BytesWritten = 0;

    BytesWritten += std::sprintf( &toExec[BytesWritten], "%s * printHello_mangle = reinterpret_cast<%s*>(%llu);",
                  "std::function<void()>", "std::function<void()>", hello_addr);
    BytesWritten += std::sprintf( &toExec[BytesWritten], "%s & printHello = *printHello_mangle;",
                  "std::function<void()>");

    const char proxyStruct[] =
    "struct MyProxy{"
        "void set (double d) {fun_setter(d); }"
        "double get () {return fun_getter(); }"
    "}; MyProxy foo2;";
    BytesWritten += std::sprintf( &toExec[BytesWritten], proxyStruct);
    BytesWritten += std::sprintf( &toExec[BytesWritten], " printHello();");

    foo.set(11234.22);
    std::cout << "foo is: " << foo.get() << "\n";

    inter.process(toExec.data());

    inter.process("foo2.set(45.123);");
    std::cout << "foo is: " << foo.get() << "\n";

}

The output here is exactly as I desire

[quote]./tcling
foo is: 11234.2
Hi Everyone!!!
foo is: 45.123
[/quote]

However, when I comment out the first inter.process(toExec.data()); call and the BytesWritten = 0; line that follows it[quote]
// inter.process(toExec.data());
// BytesWritten = 0;[/quote]

[quote]./tcling
foo is: 11234.2
input_line_4:2:536: error: reference to local variable ‘fun_setter’ declared in enclosing function ‘__cling_Un1Qu30’
…& printHello = *printHello_mangle;struct MyProxy{void set (double d) {fun_setter(d); }double get () {return fun_getter(); }}; MyProxy foo2; printHello();
^
input_line_4:2:146: note: ‘fun_setter’ declared here
std::function<void(double)> * fun_setter_mangle = reinterpret_cast<std::function<void(double)>*>(140737143517008);std::function<void(double)> & fun_setter = …
^
input_line_4:2:574: error: reference to local variable ‘fun_getter’ declared in enclosing function ‘__cling_Un1Qu30’
…& printHello = *printHello_mangle;struct MyProxy{void set (double d) {fun_setter(d); }double get () {return fun_getter(); }}; MyProxy foo2; printHello();
^
input_line_4:2:310: note: ‘fun_getter’ declared here
*fun_setter_mangle;std::function<double()> * fun_getter_mangle = reinterpret_cast<std::function<double()>*>(140737143517040);std::function<double()> & fun_get…
^
input_line_5:2:2: error: use of undeclared identifier ‘foo2’
foo2.set(45.123);
^
foo is: 11234.2
[/quote]

This is what I want my whole execution string to look like (generated after one run (with above errors))

[quote]{ std::function<void(double)> * fun_setter_mangle = reinterpret_cast<std::function<void(double)>*>(140732576095296);
std::function<void(double)> & fun_setter = *fun_setter_mangle;std::function<double()> * fun_getter_mangle = reinterpret_cast<std::function<double()>*>(140732576095328);
std::function<double()> & fun_getter = *fun_getter_mangle;std::function<void()> * printHello_mangle = reinterpret_cast<std::function<void()>*>(140732576095360);
std::function<void()> & printHello = *printHello_mangle;
struct MyProxy{
void set (double d) {fun_setter(d); }
double get () {return fun_getter(); }
};
MyProxy foo2;
printHello();
foo2.set(45.123); }
[/quote]

I don’t claim that my way of doing things is right, and I clearly am having trouble with this. Could anyone recommend me a way of doing this right?

I need a mechanism where I can run toExec (the std::array) an unlimited amount of times without cling telling me that I’m trying to redeclare anything. (that’s why in my general use case, it’s surrounded by braces)

I was thinking of somehow copying the cling::Interpreter (a full clone, not a child copy). However, the copy constructor and assignment operator are deleted, and I didn’t notice any clone() or copy() functions in Interpreter.h either, so I’m once again at a loss.

Any help would be appreciated :slight_smile:
Thanks in advance.

Hi,

Sorry - I’m on vacation and this is an example too complex for me to parse and understand within 10 minutes… Can you strip this down to something very simple, that still explains what you are trying to do? (E.g. - are those sprintf an artifact of what you are actually doing or significant to reproduce the issue you are trying to solve?)

I’m back at work in a week. Please either wait until then or simplify the example and I might get around to have a look or let’s hope that e.g. Danilo or Vassil or another colleague of mine jumps in :slight_smile:

Cheers, Axel.

Hey Axel,

Sorry about that.
I’ll make two posts: one with what I have that works, and one with what I want but that doesn’t work.

This post contains code that works for only one cling::Interpreter instance. If I try running it again, I will encounter an error (redefinition), and most of the time, unload will work as expected (see above reference).

As you can see, I provided a reference to the function that modifies foo. foo was originally set to 33.334 by the compiled program in main(), but was later modified by cling to 45.123. This is what I want, but, as stated, it only works for a single instance.
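The listings in this thread all rely on one trick: the address of a std::function is printed as a decimal number into the source string, and the interpreter casts that number back to a pointer. Here is a minimal sketch of that round trip in plain C++, without cling, assuming everything runs in the same process and the std::function is still alive when the pointer is used. The helper names (address_as_text, pointer_from_text, set_through_text_address) are made up for illustration and are not part of cling.

```cpp
#include <cstdint>
#include <functional>
#include <sstream>
#include <string>

// Render the address of an object as decimal text, the form that gets
// spliced into the source string handed to the interpreter.
template <typename T>
std::string address_as_text(T* object) {
    std::ostringstream out;
    out << reinterpret_cast<std::uintptr_t>(object);
    return out.str();
}

// Parse the decimal text back into a typed pointer. In the cling case the
// interpreter performs this step when it evaluates
// reinterpret_cast<T*>(<number>).
template <typename T>
T* pointer_from_text(const std::string& text) {
    std::istringstream in(text);
    std::uintptr_t raw = 0;
    in >> raw;
    return reinterpret_cast<T*>(raw);
}

// Hypothetical demo: call a capturing lambda through a pointer that was
// recovered from its printed address.
double set_through_text_address(double new_value) {
    double target = 33.334;
    std::function<void(double)> setter = [&target](double v) { target = v; };

    const std::string text = address_as_text(&setter);
    auto* recovered = pointer_from_text<std::function<void(double)>>(text);
    (*recovered)(new_value);  // same object, reached via its printed address
    return target;
}
```

Calling set_through_text_address(45.123) goes through the recovered pointer and returns the value the lambda stored. Using a stream instead of sprintf also sidesteps having to pick a format specifier that matches uintptr_t.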

[code]#include <array>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <iostream>
#include <string>
#include <cling/Interpreter/Interpreter.h>

static const std::string PATH_TO_LLVM {"/home/d4nf/Custom_Libs/Root-CERN/root-6.06.00.gcc530/etc/cling"};
//static const std::string PATH_TO_LLVM {"/home/d4nf/Custom_Libs/Root-CERN/root-cern-1-26-15.gcc530/etc/cling"};
static const char * clingArgs[] = {"", "-std=c++11"};

struct Mobject
{
void set(double _d) {d = _d;}
double get() {return d;};
private:
double d; // I cannot directly access this
double f; // cling won't always agree with gcc on whether d is located before f (check the addresses yourself to verify)
};
int main()
{
Mobject foo; // make a mObject called foo, then set it
foo.set(33.334);
cling::Interpreter inter(2, clingArgs, PATH_TO_LLVM.c_str());
inter.declare("#include <functional>\n"); // we are using std::function so we need to let cling know
std::cout << "foo is: " << foo.get() << "\n";

// make lambda functions that call public member functions of foo
std::function<void(double)> fun_setter = [&foo] (double d_cpy) -> void { foo.set(d_cpy);};
std::function<double()> fun_getter = [&foo] () -> double {return foo.get();};


// get addresses of our std::functions (%llu expects unsigned long long)
unsigned long long setter_addr = reinterpret_cast<uintptr_t>(&fun_setter);

// to avoid constantly reallocating a std::string, we use a large buffer and an index.
// sprintf returns the number of bytes it wrote (not counting the terminating null it appends),
// so each subsequent call to sprintf starts writing at the previous null character
std::array<char, 4096> toExec;

int BytesWritten = 0;
// start a scope and get a non-pointer access ref var to the re-casted std::function that we placed in (via the address)
BytesWritten += std::sprintf( &toExec[BytesWritten], " %s * fun_setter_mangle = reinterpret_cast<%s*>(%llu);",
                              "std::function<void(double)>", "std::function<void(double)>", setter_addr);
BytesWritten += std::sprintf( &toExec[BytesWritten], "%s & fun_setter = *fun_setter_mangle;",
                              "std::function<void(double)>");

inter.process(toExec.data()); // register these std::functions to the interpreter
        std::cout << "\n\nThis is what was supplied to cling::interpreter.process(): \n" << toExec.data() << "\n";
BytesWritten = 0;
// we tell cling what this proxy is, we defined fun_setter() and fun_getter() just above
const char proxyStruct[] =
        "struct MyProxy{"
        "void set (double d) {fun_setter(d); }"
        "}; MyProxy foo2;";
BytesWritten += std::sprintf( &toExec[BytesWritten], proxyStruct);

BytesWritten += std::sprintf( &toExec[BytesWritten], "foo2.set(45.123);"); // in this main we ran foo.set(33.334);
// we expect foo.get() to return 45.123 even though it was called in the interpreter


inter.process(toExec.data()); // we just made a large string, so now we finally have the interpreter process it all.

// inter.process(toExec.data()); // though commented out, my intention is to be able to run this string multiple times without
// inter.process(toExec.data()); // getting conflicts through

// this string shows what was submitted to the interpreter (via process)
std::cout << "\n\nThis is what was supplied to cling::interpreter.process(): \n" << toExec.data() << "\n";
std::cout << "\n\n\nfoo is: " << foo.get() << "\n"; // as stated above, we expect to get 45.123 instead of 33.334

return 0;

}[/code]

output

This is the second post, showing the behavior that I want but don’t get.

Please keep in mind that this sample is identical to the one above, except that the following lines are commented out:

[quote]/////// inter.process(toExec.data()); // register these std::functions to the interpreter
/////// std::cout << "\n\nThis is what was supplied to cling::interpreter.process(): \n" << toExec.data() << "\n";
/////// BytesWritten = 0;[/quote]
I also added sprintf calls to wrap the string in braces.

[code]#include <array>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <iostream>
#include <string>
#include <cling/Interpreter/Interpreter.h>

static const std::string PATH_TO_LLVM {"/home/d4nf/Custom_Libs/Root-CERN/root-6.06.00.gcc530/etc/cling"};
//static const std::string PATH_TO_LLVM {"/home/d4nf/Custom_Libs/Root-CERN/root-cern-1-26-15.gcc530/etc/cling"};
static const char * clingArgs[] = {"", "-std=c++11"};

struct Mobject
{
void set(double _d) {d = _d;}
double get() {return d;};
private:
double d; // I cannot directly access this
double f; // cling won't always agree with gcc on whether d is located before f (check the addresses yourself to verify)
};
int main()
{
Mobject foo; // make a mObject called foo, then set it
foo.set(33.334);
cling::Interpreter inter(2, clingArgs, PATH_TO_LLVM.c_str());
inter.declare("#include <functional>\n"); // we are using std::function so we need to let cling know
std::cout << "foo is: " << foo.get() << "\n";

// make lambda functions that call public member functions of foo
std::function<void(double)> fun_setter = [&foo] (double d_cpy) -> void { foo.set(d_cpy);};
std::function<double()> fun_getter = [&foo] () -> double {return foo.get();};


// get addresses of our std::functions (%llu expects unsigned long long)
unsigned long long setter_addr = reinterpret_cast<uintptr_t>(&fun_setter);

// to avoid constantly reallocating a std::string, we use a large buffer and an index.
// sprintf returns the number of bytes it wrote (not counting the terminating null it appends),
// so each subsequent call to sprintf starts writing at the previous null character
std::array<char, 4096> toExec;

int BytesWritten = 0;
// start a scope and get a non-pointer access ref var to the re-casted std::function that we placed in (via the address)
BytesWritten += std::sprintf( &toExec[BytesWritten], "{ ");
BytesWritten += std::sprintf( &toExec[BytesWritten], " %s * fun_setter_mangle = reinterpret_cast<%s*>(%llu);",
                              "std::function<void(double)>", "std::function<void(double)>", setter_addr);
BytesWritten += std::sprintf( &toExec[BytesWritten], "%s & fun_setter = *fun_setter_mangle;",
                              "std::function<void(double)>");

/////// inter.process(toExec.data()); // register these std::functions to the interpreter
/////// std::cout << "\n\nThis is what was supplied to cling::interpreter.process(): \n" << toExec.data() << "\n";
/////// BytesWritten = 0;
// we tell cling what this proxy is, we defined fun_setter() and fun_getter() just above
const char proxyStruct[] =
        "struct MyProxy{"
        "void set (double d) {fun_setter(d); }"
"}; MyProxy foo2;";
BytesWritten += std::sprintf( &toExec[BytesWritten], proxyStruct);

BytesWritten += std::sprintf( &toExec[BytesWritten], "foo2.set(45.123);"); // in this main we ran foo.set(33.334);
// we expect foo.get() to return 45.123 even though it was called in the interpreter

BytesWritten += std::sprintf( &toExec[BytesWritten], " }");

inter.process(toExec.data()); // we just made a large string, so now we finally have the interpreter process it all.

// inter.process(toExec.data()); // though commented out, my intention is to be able to run this string multiple times without
// inter.process(toExec.data()); // getting conflicts through

// this string shows what was submitted to the interpreter (via process)
std::cout << "\n\nThis is what was supplied to cling::interpreter.process(): \n" << toExec.data() << "\n";
std::cout << "\n\n\nfoo is: " << foo.get() << "\n"; // as stated above, we expect to get 45.123 instead of 33.334

return 0;

}

[/code]

[quote]./tcling
foo is: 33.334
input_line_4:2:217: error: reference to local variable ‘fun_setter’ declared in enclosing function ‘__cling_Un1Qu30’
…& fun_setter = *fun_setter_mangle;struct MyProxy{void set (double d) {fun_setter(d); }}; MyProxy foo2;foo2.set(45.123); }
^
input_line_4:2:149: note: ‘fun_setter’ declared here
fun_setter_mangle = reinterpret_cast<std::function<void(double)>*>(140729146831104);std::function<void(double)> & fun_setter = *fun_setter_mangle;…
^

This is what was supplied to cling::interpreter.process():
{ std::function<void(double)> * fun_setter_mangle = reinterpret_cast<std::function<void(double)>*>(140729146831104);std::function<void(double)> & fun_setter = *fun_setter_mangle;struct MyProxy{void set (double d) {fun_setter(d); }}; MyProxy foo2;foo2.set(45.123); }

foo is: 33.334

[/quote]

Please let me know if I’m still losing you

Bump

Hi,

Sorry, I’m slow. In the middle of cling’s llvm upgrade…

So here’s what’s happening inside the interpreter: you call

process("{\
  std::function<void(double)> * fun_setter_mangle \
    = reinterpret_cast<std::function<void(double)>*>(140730316799440);\
  std::function<void(double)> & fun_setter = *fun_setter_mangle;\
  struct MyProxy{void set (double d) {fun_setter(d); }};\
  MyProxy foo2;\
  foo2.set(45.123);}");

Currently (until someone finally finds the time to fix it; we know how, no magic, but time…) cling has only rudimentary support for distinguishing whether something is a declaration or a statement. That makes a huge difference for cling:

Declarations are interpreted as-is, while statements are ill-formed at global scope and are wrapped in a synthesized function that cling then calls. The heuristic for whether input is a declaration or a statement only looks at the first few identifiers: if it’s “namespace” or “class” or “extern” or “#include” then we’ll declare, else we’ll wrap and call.

Before the latter, we’ll try to extract declarations, because you will likely want to refer to these declarations from a later transaction, so keeping them function-local doesn’t make sense. But this doesn’t hit you. Back to statement versus declaration:

Your code is interpreted as a statement. It’s wrapped, so there is a local struct. That struct uses a function-local variable (fun_setter), and that’s ill-formed.
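The same rule can be seen in ordinary compiled C++, independent of cling: a class defined inside a function may not use an automatic variable of that function, whereas a variable at file scope is fine. Here is a small sketch with made-up names (g_value, g_setter, GlobalProxy, call_through_global_proxy). The rejected variant is left in a comment so that the snippet still compiles.

```cpp
#include <functional>

// File-scope state: visible from member functions of any class, which is
// what declaring the reference at global scope gives the proxy.
static double g_value = 0.0;
static std::function<void(double)> g_setter = [](double d) { g_value = d; };

struct GlobalProxy {
    void set(double d) { g_setter(d); }  // OK: g_setter has static storage
};

double call_through_global_proxy(double d) {
    // The function-local variant below is rejected by the compiler, with a
    // diagnostic along the lines of "reference to local variable
    // 'local_setter' declared in enclosing function":
    //
    //   std::function<void(double)> local_setter = [](double) {};
    //   struct LocalProxy {
    //       void set(double d) { local_setter(d); }
    //   };
    GlobalProxy proxy;
    proxy.set(d);
    return g_value;
}
```

Wrapping the whole input in braces puts MyProxy and fun_setter in the position of LocalProxy and local_setter above, which is why the braces trigger the error.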

So instead please use declare() on declarations (up to and including “MyProxy foo2;”) and then use process() only on “foo2.set(45.123);”.
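One way to arrange that split is to build two separate strings, one for declare() and one for process(). This is only a sketch: the helper names (build_declarations, build_statement) are invented for illustration, and the interpreter calls appear only in a comment because they need a cling installation. They reuse the inter object and the declare()/process() calls from the listings earlier in the thread.

```cpp
#include <cstdint>
#include <functional>
#include <sstream>
#include <string>

// Text meant for declare(): everything up to and including "MyProxy foo2;".
// The address of the compiled std::function is spliced in as a decimal number.
std::string build_declarations(std::function<void(double)>* setter) {
    std::ostringstream src;
    src << "std::function<void(double)>* fun_setter_mangle = "
        << "reinterpret_cast<std::function<void(double)>*>("
        << reinterpret_cast<std::uintptr_t>(setter) << ");\n"
        << "std::function<void(double)>& fun_setter = *fun_setter_mangle;\n"
        << "struct MyProxy { void set(double d) { fun_setter(d); } };\n"
        << "MyProxy foo2;\n";
    return src.str();
}

// Text meant for process(): only the statement, which declares nothing.
std::string build_statement(double value) {
    std::ostringstream src;
    src << "foo2.set(" << value << ");";
    return src.str();
}

// Intended use (not compiled here, requires cling):
//   inter.declare(build_declarations(&fun_setter).c_str());  // once
//   inter.process(build_statement(45.123).c_str());          // repeatable
```

Since the statement string contains no declarations, sending it to process() repeatedly should not run into redeclaration errors. The declarations themselves still have to be sent only once per interpreter.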

Does this make sense?

As I said - we know how to separate decls from statements, but it’s a non-negligible amount of work. Actually - we know of two ways which means we’ll have to implement both and see which one is better - that takes even more time :frowning:

Cheers, Axel.