
Dailydave mailing list archives
Re: PAPER: Dynamic Data Flow Analysis via Virtual Code Integration (aka The SpiderPig case)
From: "Piotr Bania" <bania.piotr () gmail com>
Date: Mon, 18 May 2009 20:20:20 +0200
Yo,
I've got a few questions regarding your approach.
1) In section 4.4 you discuss predicting data propagation and you use the term 'symbolic execution'. Does this mean you treat all input as symbolic, e.g. everything from a recv() call is marked as 'tainted'?
2) If the answer to the previous question is 'yes', how do you deal with symbolic reads/writes using your O_in/O_out register mechanism? I can't see this working for memory, as the size of those sets becomes potentially unbounded (well, bounded by the amount of usable memory). E.g. how do you describe the memory written to by "mov dword ptr [eax], ebx" if eax is symbolic and dependent on user input? A more tangible situation might be the case where a child object is created, then written to memory at a symbolic offset and then later read again.
First of all, SpiderPig in its current shape requires the user to pick the starting point (that was the main idea from the beginning). In other words, the user must specify the register or memory region which is tainted (pick the root of the taint). SpiderPig can taint either a memory location or CPU elements such as registers and flags.
Regarding section 4.4 (Predicting Data Propagation), the symbolic execution approach (O_in/O_out variants) refers only to the elements of the CPU architecture, not the memory locations they point to. So if SpiderPig meets an instruction like "mov dword ptr [eax], ebx", the "ProcessStandardInstruction()" function (see Algorithm 1, page 22) is used. In this case it basically does the following:
1) kills the 4-byte memory region pointed to by EAX (saving all the information about the killer instruction)
2) if the EBX value is tainted, then the 4-byte memory region pointed to by EAX is also tainted
To save some time, the referenced memory address (in this case pointed to by EAX) is computed by the instrumentation code (on the fly) inside the target process. Now if there are any possible data propagations afterwards in the Dataflow Region between CPU elements, the symbolic execution approach is used. I think it is important to note that each Dataflow Region is considered side-effect free (see Definition 4, page 24).
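The two steps for "mov dword ptr [eax], ebx" can be sketched roughly as follows, in Python. This is only an illustration, not SpiderPig's code: the names TaintState and process_mov_mem_reg are hypothetical, taint is tracked per byte for simplicity, and the destination address is assumed to be already resolved to a concrete value (as the instrumentation code would do at run time).

```python
from dataclasses import dataclass, field


@dataclass
class TaintState:
    """Hypothetical bookkeeping; not taken from the paper."""
    tainted_regs: set = field(default_factory=set)  # e.g. {"ebx"}
    tainted_mem: set = field(default_factory=set)   # tainted byte addresses
    killers: dict = field(default_factory=dict)     # byte address -> last overwriting instruction


def process_mov_mem_reg(state, insn, dst_addr, src_reg, size=4):
    """Model `mov [dst_addr], src_reg` with a concrete destination address."""
    region = range(dst_addr, dst_addr + size)
    # Step 1: kill the destination region and remember the killer instruction.
    for addr in region:
        state.tainted_mem.discard(addr)
        state.killers[addr] = insn
    # Step 2: if the source register is tainted, the region becomes tainted.
    if src_reg in state.tainted_regs:
        state.tainted_mem.update(region)
    return state


# Usage: EBX is the user-chosen taint root, EAX resolved to 0x1000 at run time.
state = TaintState(tainted_regs={"ebx"})
process_mov_mem_reg(state, "mov dword ptr [eax], ebx", dst_addr=0x1000, src_reg="ebx")
```

Because the address is concrete by the time the store is processed, the set of affected bytes is always exactly `size` entries, which is why the unbounded-set concern does not arise for memory in this scheme.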
3) What DynamoRIO plugin are you comparing your code to?
If you are referring to "Test application's performance" (page 39), it was a very simple plugin (written by me) whose task was to gather and save the CPU context for each executed instruction. As I stated in section 5.2.3 ("Analysis (Instrumentation) Performance", page 38), there is nothing really to compare. VCI is VCI, DBI is DBI, and IMHO they should be treated separately. In short, in the case of VCI I don't need to waste time on dispatcher calls for "every"* transfer instruction. Anyway, DynamoRIO is personally my favorite DBI so far, and I really admire Derek and the rest of the authors for providing such an excellent tool. I think it is quite possible I will port SpiderPig to DynamoRIO, especially now that it has become an open source project[1].
Cheers, and good work,
I'm glad you liked it and I hope I have answered your questions.
cheers,
pb
* I am aware DynamoRIO has some optimizations for that
[1] - http://code.google.com/p/dynamorio/
On Mon, May 18, 2009 at 1:32 PM, Piotr Bania <bania.piotr () gmail com> wrote:
SpiderPig is a project created for performing and visualizing data flow analysis of a selected binary program. SpiderPig was created for the purpose of providing a tool to help vulnerability and security researchers trace and analyze any necessary data and its further propagation. Such tasks are very often crucial in the vulnerability discovery/identification process and typically require a lot of time-consuming manual work. The following paper discusses the methods and techniques implemented in SpiderPig in order to perform semi-automatic data flow analysis.
Paper is available here: http://piotrbania.com/all/spiderpig/pbania-spiderpig2008.pdf
A simple video demo and some other things are available on the project website: http://piotrbania.com/all/spiderpig/
best regards,
Piotr Bania
--
--------------------------------------------------------------------
Piotr Bania - <bania.piotr () gmail com> - 0xCD, 0x19
Fingerprint: 413E 51C7 912E 3D4E A62A BFA4 1FF6 689F BE43 AC33
http://www.piotrbania.com - Key ID: 0xBE43AC33
--------------------------------------------------------------------
- "The more I learn about men, the more I love dogs."
P.S Did ya know adult pigs can run at speeds of up to 11 miles an hour?
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave
-- http://www.unprotectedhex.com http://www.smashthestack.org