How to Manage Native C++ Memory in Android (Google I/O '17)


[Music]

Hello, welcome. I'm Hans Boehm. I'm a software engineer in the Android Runtime group, and I'm here to talk to you about how to manage native C++ memory in Android.

Generally, most Android applications are written in Java today, maybe Kotlin in the future, but there are sometimes reasons to also write pieces of the application in C or C++ code. It may be the case that you can implement some algorithm more efficiently in a language like C++, or you may have a native library that already exists that you want to use, or the like. Here we're really talking about multi-language applications that combine Java and C++.

It turns out that even if you're in the business of writing 100% Java apps, some of what I'm going to talk about may still be an issue in certain isolated cases. For example, if you look at the Android platform implementation, the BigInteger implementation that you're using, if you're using BigInteger at all, is actually implemented in terms of native code underneath, and in other cases that sort of thing may actually show through. That was one of the reasons I got involved in this: I spent some time working on the Android calculator app, which happens to be a major client of BigInteger.

Here we're going to use a running example of a hypothetical C++ package, which we want to access from Java language code, that manipulates polynomials over GF(2). Does anybody know what that means? Good, you're not supposed to; it doesn't matter. As far as we're concerned, these are basically bit vectors with an odd notion of multiplication, but I don't even care about that, and I'm not going to show you the C++ code. This particular example is good because it turns out that, in spite of the fact that it seems esoteric, modern hardware often has hardware support for this, so you can actually build a blazingly fast, really low-level implementation. So what we want to do is: we have this library, and we want to
call that library from Java code. What this looks like roughly is the following; it's the code that you see up here. The important part here is that we have a Java class which logically owns a C++ object that actually contains the guts of the implementation, the real functionality, which I'm not going to show you. What the Java object holds is a Java long, which is really a C++ pointer in disguise. Then, for example, when you want to multiply two of these, this is eventually going to call this nativeMultiply function and pass it two Java longs. This goes into some JNI code, which will cast those Java longs to C++ pointers and manipulate the C++ objects, and in this particular case return another Java long, which is really a C++ pointer. So the Java-level multiplication routine is really one that just gives me back a new BinaryPoly object containing this Java long, which really points to a C++ object, and I obtained the underlying C++ pointer by calling this nativeMultiply routine.

Here's a pictorial representation of what this looks like. I have the Java object at the top of the slide, which is the only thing that my client actually sees. I want the client to be able to treat this Java object as though it were implemented completely in Java and ignore the C++ code underneath the dot-dash line. Here is the Java long, which is really a C++ pointer that points to the C++ object representation, though Java doesn't know that. And inside the C++ object representation there may be additional pointers that point to additional C++ data structures.

So what's the problem with doing this? The problem comes into play when we think about how memory gets managed and how objects get deallocated. On the Java language side, we have a garbage collector that cleans up after us, and we generally don't have to worry about this too
much: when things are no longer referenced, when they become unreachable, they go away. On the C++ side, we usually have a manual memory management discipline, which means we explicitly need to call some delete function or deleter in order to deallocate the memory when it's no longer reachable. So how do we actually do that? We have to arrange for somebody to call the delete function on the C++ object.

Here's the traditional way to do this, and the point of this talk is largely to talk you out of doing it this way. The traditional way, which admittedly has the attraction that the code is relatively simple to write compared to what I'm going to show you, is that in addition to things like the nativeMultiply method, I'll have a nativeDelete method, to which I also pass a Java long. The nativeDelete method will again convert that to the C++ pointer and invoke delete on it, or invoke the appropriate deleter on it, and I will call nativeDelete from a Java finalizer. The finalizer is invoked more or less when the object becomes unreachable, so that then goes ahead and invokes the deleter and correctly deallocates the C++ object as well.

I'll show you some reasons why this is problematic. I'll go through a long list of finalizer problems. Finalizers have a deserved reputation for being hazardous, and I'll only confirm that here, but I'll emphasize some relatively lesser-known problems, which in my opinion are actually the serious ones to come to grips with, and then I'll show you how to work around those.

The first problem is that if two objects become unreachable, the finalizers actually run in arbitrary order. That includes the case in which two objects that point to each other become unreachable at the same time: they can be finalized in the wrong order, meaning that the second one to be finalized may actually try to access an object that's already been finalized. I'll go into
some more details there. As a result, what can happen is that you get dangling pointers to deallocated C++ objects. The thing to keep in mind is that as a result of dangling pointers in this environment, you very often end up corrupting the C++ heap. But what's more, the Java language runtime actually relies on the C++ heap as well, so when this happens, often what you end up with is also a corrupted Java runtime, and you may end up seeing crashes that actually look like Java garbage collector crashes or the like.

To illustrate how this can go wrong, here's a sample client application that uses the BinaryPoly class that we just talked about. This, in combination with the previous class, will definitely break, so definitely do not do this. What this is doing in its finalize method is just something innocuous on myBinaryPoly, which is a field that happens to hold a pointer to the BinaryPoly. The problem is that it's entirely possible that the myBinaryPoly object actually gets finalized first, so what you're accessing here is a BinaryPoly which has already been finalized. By the time you access it, the C++ object underlying it will have been deallocated, and the native pointer that you're using points to nowhere. So that's a bad thing.

There's a second problem with finalizers, which turns out to be more complicated to work around. It's less important on Android, but it's generally important when writing Java language code. By Java language rules (and this is not currently believed to be true on Android), an object x's finalizer may actually be invoked while one of x's methods is still running. So while I'm still running a method on that object, that object may end up getting finalized. This can result in the same sort of phenomenon: I get native heap corruption and, as a result,
possibly Java runtime corruption. Let me explain why that happens. Here's, again, an excerpt from the BinaryPoly class. I used multiply before; I'll use it again here. I have this nativeMultiply method that gets called by multiply. If you look at what this actually gets compiled to (and this isn't real, this is sort of pseudocode), it looks something like this. When I want to multiply two values, I first end up reading the native handles from both of them, in this case from this, and I put this in here explicitly to make that clear. So I read the native handle from this, and I read the native handle from other. Then I might allocate the new BinaryPoly object, and then I will go ahead and call the native method.

The problem with this is, if we look at the uses of the actual Java objects, the last use of this and other actually happens before I allocate the new BinaryPoly. From that point on, this method doesn't use either this or other anymore, and it may so happen that this is in fact the last call on those. So this is in fact the point at which those become garbage and the garbage collector notices that they're no longer reachable. What can happen at that point, if the implementation is designed to allow this and if all the stars line up just right, is that the garbage collector decides that this and other are no longer needed and arranges for the finalizers on both of those to get invoked roughly where the new BinaryPoly allocation occurs. What then happens is that when we call nativeMultiply, we don't need the objects anymore, but we still need the native handles, and now the native handles have been deallocated. So in fact I'm accessing objects that are no longer around. This is allowed by the Java language specification, and it's something that has been seen on occasion in the wild, but it doesn't happen very
often.

There are more problems with finalizers. You can see a lot of them by looking at Joshua Bloch's Effective Java book, which has a section entitled, I believe, "Avoid finalizers". One thing that indicates the problems is that the plan, I believe, is to deprecate Object.finalize in JDK 9.

Another issue, which I'll look at here a little bit, is that for this to work correctly, if you run an application that allocates lots of native memory and relatively little Java memory, it may not be the case that the garbage collector runs frequently enough to actually invoke finalizers, or the other mechanism I'll suggest in a minute. So you may have to invoke System.gc() and System.runFinalization() occasionally, which is tricky to do, because if you do it too much it will greatly slow down your application, and many people have fallen into that trap.

There's a more subtle issue: finalizers sometimes extend the lifetime of the Java object for another garbage collection cycle, which means that with a generational garbage collector they may cause it to survive into the old generation, and the lifetime may be greatly extended as a result of just having a finalizer. And there are other issues which I won't go into here.

So how do you really delete C++ objects? I should point out that the usual advice here, which I'll skim over briefly, is to use explicit close and try-with-resources. Java has a syntactic facility which allows essentially C++-style destruction: when you leave a scope, you can arrange to explicitly call the close function, and that works when it's applicable. For cases like file-like objects, system-allocated objects and so on, it's generally the right approach, and that's the main recommendation. On the other hand, there are many cases in my experience where that doesn't work, and in general people already use this in those cases where it's applicable.
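The explicit close and try-with-resources advice can be illustrated with a minimal sketch. Everything named here (CloseableHandleDemo, NativeHandle, liveNativeObjects, use) is hypothetical and not from the talk, and the native allocation is simulated with a counter so that the example runs without any JNI code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a Java wrapper that owns a (simulated) native resource
// and releases it deterministically through AutoCloseable.
final class CloseableHandleDemo {
    // Stands in for the number of live C++ objects; a real wrapper would hold a long handle.
    static final AtomicInteger liveNativeObjects = new AtomicInteger();

    static final class NativeHandle implements AutoCloseable {
        private boolean closed = false;

        NativeHandle() {
            liveNativeObjects.incrementAndGet(); // simulated native allocation
        }

        int use() {
            if (closed) {
                throw new IllegalStateException("already closed");
            }
            return 42; // simulated result of a native call
        }

        @Override
        public void close() {
            if (!closed) { // make close idempotent
                closed = true;
                liveNativeObjects.decrementAndGet(); // simulated native delete
            }
        }
    }

    // try-with-resources calls close() when the block is left,
    // whether normally or through an exception.
    static int run() {
        try (NativeHandle h = new NativeHandle()) {
            return h.use();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
        System.out.println(liveNativeObjects.get());
    }
}
```

This only works when the owner has a clear scope, which is exactly the limitation the talk goes on to describe.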
So, for example, as I mentioned, there's java.math.BigInteger in the platform. You really don't want to have to call a close method on a BigInteger every time one of those goes out of scope; that would be completely untenable. And there are many more examples like that in the Android platform.

I should warn you that, for the solutions I'm talking about, a lot of this is actually not fully settled. The community as a whole, and this goes beyond Android as well, is still trying to figure out what the right way to do this is. These are mostly general Java language issues that are not specific to Android. But there seems to be agreement that you shouldn't use Object.finalize, as evidenced by the fact that it's going to be deprecated sometime in the future.

So the advice here, and I'll go into how to do this, is to use something called java.lang.ref.PhantomReference. Many of you may have encountered that; as far as I can tell, most people look at it and ignore it because it's somewhat complicated, and it probably appears even more complicated than it actually is. It's not really commonly used, but it actually is a better replacement for finalizers. It avoids the problem that a finalizer can see finalized objects because of the ordering issues: phantom references ensure that you run the cleanup code only when the object really is about to go away and nobody can use it or see it anymore. It certainly deals with the finalizer deprecation issue; it's not going away. It also ends up dealing with some of the more subtle issues, though not all of them. It does not deal with the premature cleanup issue that I mentioned earlier, that an object can be finalized while one of its methods is still running, for example.

The major difficulty with using it, and I'll go through an example here, is that it's relatively complicated to use at the moment. We're in the process of making that better; I think various groups are. Java 9 has this notion of a Cleaner, which makes this a little bit simpler. Inside the platform, we have something called NativeAllocationRegistry that deals with some of this. This is at the moment not a public API, but we're trying to assess whether it would be good to make it public and whether this is the right API to do that with.

So what should you use instead? Well, I mentioned these phantom references. So what are they? A phantom reference, shown here with the ghost next to it, is an object that's associated with the object whose lifetime you want to monitor. It doesn't actually point to, and isn't the same as, the BinaryPoly object that we want monitored and that we want to clean up after. You can think of the phantom reference as the last will and testament
of the Java object: it tells you how to clean up once the Java object dies. The way this works normally, in order to actually use a phantom reference, is that you'll usually inherit from it, and the derived class, the class derived from PhantomReference, will have a pointer to the C++ object along with the Java BinaryPoly object. There's also a reference queue off to the side here; I'll show you what role that plays in a second. Notice that I put those ground symbols in here in a few places to indicate that those are objects that we need to make sure don't get garbage collected, so we need some mechanism for keeping those around. That applies to both the phantom reference and the reference queue.

So what happens after the BinaryPoly becomes unreachable? The phantom reference itself gets added to the associated reference queue, and that's basically all that happens. The Java object gets immediately collected; there's no longer any need for it, because the phantom reference itself knows how to clean up.

I'll show you a quick implementation of this here. In this case, this part is fairly easy. We've modified BinaryPoly to deal with this sort of reference cleanup, and on the next slide I'll show you the BPPhantomReference implementation, which inherits from PhantomReference; it's a kind of phantom reference that's specialized with some additional functionality. What this does here is: we still have the nativeDelete method that actually deallocates the C++ object. Whenever I allocate a BinaryPoly object, I now do it through this factory method, which constructs and allocates the object but then immediately goes and registers it through a static method in BPPhantomReference, the way I've implemented it here.

Now, what's BPPhantomReference? That fits on a slide, in the sense of the old Midas
commercial, if you remember that, but I squeezed it to fit on a slide here. What it does is introduce a couple of static data structures. One of them is the actual reference queue, which will be used to enqueue these BPPhantomReferences once the corresponding Java object is no longer needed. I also need to have a concurrently accessible set here, some way to just keep the BPPhantomReferences around so they don't get garbage collected themselves; I keep those around until they're explicitly removed.

Whenever I construct one of these BPPhantomReferences, I first of all construct the PhantomReference, giving it the Java object and the reference queue. This tells the underlying phantom reference implementation to watch that Java object and put the reference onto the queue that I gave it when the Java object goes away. Then I also remember the native handle, so I can actually invoke the deallocation function when it's time.

So what happens when I register one of these things? I register a Java object and the corresponding native handle, I create a new BPPhantomReference, and I add it to my set here, just to make sure that it doesn't go away. And then, and this is part of the ugliness of the scheme that we're trying to address at some point, every once in a while I need to arrange that everything that's on the reference queue gets deleted. I've done that here by providing a method doDeletes, which just checks whether there's anything available for deletion and, if so, goes ahead and deletes it. In other contexts you may want to do this differently: rather than just polling whether something is available, you may want to dedicate a thread to this and block. It depends on the context. It's a little bit tricky to do this well in a large system, because the easy way to do this is to create a new thread for every class that does this, which
may or may not be acceptable depending on how many of these you have.

Then, in order to actually make this work, my application needs to do the following things periodically. In simple cases you can skip the System.gc() and System.runFinalization() calls. If you notice that the garbage collector isn't running enough, because you're not allocating enough Java objects to trigger it, you may have to do that explicitly, but usually that's not an issue. But you do have to regularly call BPPhantomReference.doDeletes() to actually do the cleanup, the way I've done things here.

The next problem that I mentioned is the premature enqueuing while a method is still running. As I said, this is not actually an issue for Android, but it is an issue for portable code. There's a partial solution to this in Java 9, which is this thing called reachabilityFence, which you can explicitly invoke to tell the implementation: don't let the argument go away yet; it should still be live to the garbage collector even though it might not look like it. That's basically not available in any implementation that you can use at the moment. So the best solution that my colleagues and I could come up with at the moment is the following, which is relatively simple but not exactly performance-neutral. Instead of having simple methods like multiply that just invoke the native method with the native handles, we invoke the native method with both the native handles and the Java objects. So I have a nativeMultiply that takes the two jobjects as well. Current implementations, though the spec doesn't 100% guarantee it, essentially guarantee that this and other turn into local refs, which tells the garbage collector to keep these around as long as the native method is running. So things will actually work out correctly, at the
expense of passing additional parameters to JNI. Again, this is recommended if you write portable code; for Android I currently wouldn't recommend doing that. Sometime in the future I think we will probably have a better solution to this.

There's one more hazard that I want to go over quickly, because this is near and dear to my heart: it took several months of several people's time to debug platform code that had this issue. If you're using C++ code in this way, you need to be really careful that you're actually calling it correctly and not violating the C++ rules underneath, and that can be quite tricky to do correctly. In particular, if you're calling C++ code from Java threads, it still has to be thread-safe as though you were calling it from multiple C++ threads, and you have to make sure to follow those rules. This is a bit aggravated by the fact that some Android framework classes actually use C++ code internally. This gets back to the point I was making at the beginning: you have to make sure to use those framework classes correctly, because otherwise you can run into this problem even without actually writing native code.

One issue that's particularly subtle on the C++ side, and that actually contributed to this problem that we spent lots of time on, is that often on the C++ side, when you call delete, you end up invoking some reference counting mechanism that then takes care of the underlying C++ objects that are indirectly referenced. You have to be really careful to do that correctly. I highly recommend using an expert-developed implementation of reference counting, not rolling your own. I spent a lot of time recently fixing bugs in Android platform reference counting implementations, so I think it's not too likely that most people will get this right implementing it themselves. There are a whole bunch of different issues here having to
do with memory ordering bugs, self-assignments, and so on. It's only a tiny amount of code, but it's a really tricky piece of code.

But assuming you have a correct reference counting implementation, it's still hard to use it correctly. The rules vary a little bit depending on the implementation, but one thing you have to remember for something like std::shared_ptr is that when you create a shared_ptr to something, the object will be deallocated when the reference count associated with copies of that shared_ptr goes to zero. So creating multiple shared_ptrs corresponding to the same underlying bare pointer is not likely to work well. It's also not likely to work well to generate a reference-counted pointer to this in a constructor, for a similar sort of reason: probably, by the time you leave the constructor, that reference count will have gone to zero and you will deallocate the object before you ever leave the constructor, which is not good. So you have to be careful with that sort of thing.

The thread safety rules are in some ways even more subtle, and also something that you really want to watch out for. Assuming here I have a bunch of shared_ptrs, it's actually okay, according to the normal C++ thread safety rules, if I take the shared_ptr x and copy it simultaneously in two different threads to two different other pointers. This will try to simultaneously update the reference count associated with x, but that's okay; it's the implementation's job to make sure that works correctly. What's not fine, and what in fact caused the sort of long-standing bugs that we had to deal with here, is the last thing on the slide: simultaneous assignments to the same reference-counted pointer. If I simultaneously assign even null to the same reference-counted pointer, that looks pretty benign. I'm assigning the same thing concurrently in two different threads, so
what could go wrong? The problem is that it's disallowed by the C++ rules, for good reason. What happens here is that both threads will simultaneously try to decrement the reference count associated with the original value of p. If they race each other just right, so that both do this before either one actually assigns null to the pointer, they will both end up decrementing the reference count, which originally had a value of 1 because the object was referenced only by p. That doesn't go well, because the reference count is then decremented once too often.

So, summarizing, the important points are: avoid finalization; use java.lang.ref.PhantomReference instead. Currently that's a little bit clumsy, but it avoids a lot of potential headaches. If you want to read up on this more, I suggest you look up the Java 9 reachabilityFence documentation and discussion. That will give you a little more insight as to why some of these things are a problem, and in particular why you have to worry about premature cleanup with the Java semantics (again, not so much on Android). And stay tuned for future improvements in this area.

If you allocate a lot of C++ objects, if that's where most of your heap space ends up getting spent, then you should think about explicit GC triggering, but do it really carefully. Make sure you keep track of how much C++ memory you've allocated, and when you've allocated enough that you think it might be useful, invoke the garbage collector so that phantom references for dead objects get enqueued; at that point you can do the C++ cleanup. Otherwise, if you're not allocating any Java memory, the Java garbage collector will usually not get triggered. And be careful with C++ memory management, and understand the rules.

We actually have a little bit of time for questions here, if there are any. That's the end of it. Thank you very much for attending.

[Applause] [Music]
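The PhantomReference scheme recommended in the talk can be sketched roughly as follows. The names BPPhantomReference, register, doDeletes and nativeDelete follow the talk's description, but the slide code itself is not in the transcript, so the details are assumptions. In particular, nativeDelete is simulated here with a counter (the hypothetical field deleted) instead of a real JNI call, and the owner is typed as Object rather than BinaryPoly to keep the sketch self-contained.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a phantom reference that remembers a native handle for cleanup.
final class BPPhantomReference extends PhantomReference<Object> {
    // Queue on which the garbage collector places references whose owner has died.
    private static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Concurrent set that keeps the references themselves strongly reachable.
    private static final Set<BPPhantomReference> LIVE = ConcurrentHashMap.newKeySet();
    // Hypothetical counter standing in for the effect of the real native delete.
    static final AtomicInteger deleted = new AtomicInteger();

    private final long nativeHandle;

    private BPPhantomReference(Object owner, long nativeHandle) {
        super(owner, QUEUE);
        this.nativeHandle = nativeHandle;
    }

    // Called by the owner's factory method right after the native object is created.
    static BPPhantomReference register(Object owner, long nativeHandle) {
        BPPhantomReference ref = new BPPhantomReference(owner, nativeHandle);
        LIVE.add(ref);
        return ref;
    }

    // Stand-in for the JNI method that would delete the C++ object.
    private static void nativeDelete(long handle) {
        deleted.incrementAndGet();
    }

    // Polls the queue and frees the native object behind every enqueued reference.
    // Returns the number of native objects released on this pass.
    static int doDeletes() {
        int count = 0;
        Reference<?> r;
        while ((r = QUEUE.poll()) != null) {
            BPPhantomReference ref = (BPPhantomReference) r;
            LIVE.remove(ref);
            nativeDelete(ref.nativeHandle);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        register(new Object(), 1L); // the owner is unreachable right away
        System.gc();                // only a hint; enqueueing is not guaranteed
        Thread.sleep(100);
        System.out.println("deleted on this pass: " + doDeletes());
    }
}
```

An application using this polling design would call BPPhantomReference.doDeletes() periodically. A dedicated thread blocking on ReferenceQueue.remove() is the alternative the talk mentions, and on Java 9 and later java.lang.ref.Cleaner packages a similar pattern.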
