Learning C# / loading binary file into struct best method?

Locked
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Learning C# / loading binary file into struct best method?

Post by neodos »

Hi!

I previously made a tool in VB.net to extract models from an Xbox game (JSRF). This time I want to make a tool for another game, and I am trying to do things better, hence C#.

I have been looking into loading binary files into structures, but I am not sure which method is best.
Basically, I am going to write and rewrite the struct while I figure out the actual structure of the game files, so the method has to be "flexible".

Any advice on how to do this best?

Thanks!
Click16
Posts: 1941
Joined: Mon Dec 31, 2007 4:36 am
Location: United States

Re: Learning C# / loading binary file into struct best metho

Post by Click16 »

What are these binary files? I am not sure what you mean by binary files and structures. Can you specify, and maybe provide an example?
XZodia
Staff
Posts: 2208
Joined: Sun Dec 09, 2007 2:09 pm
Location: UK
Contact:

Re: Learning C# / loading binary file into struct best metho

Post by XZodia »

There isn't really a best method; it just depends on what you're using it for.
JacksonCougar wrote:I find you usually have great ideas.
JacksonCougar wrote:Ah fuck. Why must you always be right? Why.
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

I want to write a Source engine model decompiler for the model files (.mdl, .vtx, .vvd), so vertex data, weight maps, bones, animation etc. Mostly ints, floats, strings, blocks of data and pointers.
But the blocks of data can vary in size and number.


Honestly, I am just trying to find an example of loading a binary file into a structure so I can get started on researching the model structure and decompiling. So far I have been stuck on getting the binary file into a simple struct :S
troymac1ure
Keeper of Entity
Posts: 1282
Joined: Sat Aug 09, 2008 4:16 am
Location: British Columbia, Canada, eh
Contact:

Re: Learning C# / loading binary file into struct best metho

Post by troymac1ure »

2 options.
Either create a struct/class and have a load function that reads each individual member in (preferred way and the only way I've used)

Code: Select all

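using System.IO;
using System.Text;

public class Data
{
   int id;
   float margin;
   int offset;

   public void readStruct(MemoryStream ms)
   {
       // leaveOpen = true (.NET 4.5+): disposing the reader must not close the caller's stream
       using (BinaryReader br = new BinaryReader(ms, Encoding.UTF8, true))
       {
           // read the members in the same order they are stored in the file
           id = br.ReadInt32();
           margin = br.ReadSingle();
           offset = br.ReadInt32();
       }
   }
}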
or
Read the binary file into a block of memory. I've never done this in C#, and people advise against it because C# pads values and aligns them to memory boundaries, etc. I found this online, though; it will only work as long as you are not using pointers or arrays:
src=http://stackoverflow.com/questions/4159 ... inary-file

Code: Select all

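using System;
using System.IO;
using System.Runtime.InteropServices;

public static class StreamExtensions
{
     public static T ReadStruct<T>(this Stream stream) where T : struct
     {
         var sz = Marshal.SizeOf(typeof(T));
         var buffer = new byte[sz];
         // Stream.Read may return fewer bytes than requested, so loop until the buffer is full
         int read = 0;
         while (read < sz)
         {
             int n = stream.Read(buffer, read, sz - read);
             if (n == 0) throw new EndOfStreamException();
             read += n;
         }
         // pin the buffer so the GC cannot move it while we marshal from it
         var pinnedBuffer = GCHandle.Alloc(buffer, GCHandleType.Pinned);
         try
         {
             return (T) Marshal.PtrToStructure(
                 pinnedBuffer.AddrOfPinnedObject(), typeof(T));
         }
         finally { pinnedBuffer.Free(); }
     }
}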

//...
  SomeStruct s = stream.ReadStruct<SomeStruct>();
The first method is probably the one you will want if you need to use pointers, arrays & strings. It's a bit annoying having to add a line for each member, but if you add them as you update your variables it should be straightforward anyway.
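As a rough sketch of how the first method copes with strings and arrays: the layout below (a 16-byte NUL-padded name followed by an Int32 count and that many floats) is purely an assumption for illustration, and `Record`, `Name` and `Weights` are made-up names, not anything from a real file format.

```csharp
using System;
using System.IO;
using System.Text;

public class Record
{
    public string Name;      // assumed: fixed 16-byte, NUL-padded ASCII field
    public float[] Weights;  // assumed: Int32 element count, then the floats

    public void Read(BinaryReader br)
    {
        // Fixed-length string: read the raw bytes, then cut at the first NUL
        byte[] raw = br.ReadBytes(16);
        if (raw.Length < 16) throw new EndOfStreamException();
        int end = Array.IndexOf(raw, (byte)0);
        Name = Encoding.ASCII.GetString(raw, 0, end < 0 ? raw.Length : end);

        // Variable-length array: the count comes first, then the elements
        int count = br.ReadInt32();
        if (count < 0) throw new InvalidDataException("negative element count");
        Weights = new float[count];
        for (int i = 0; i < count; i++)
            Weights[i] = br.ReadSingle();
    }
}
```

Because the array size comes from the file itself, this kind of member cannot be handled by the fixed-size `ReadStruct<T>` approach.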
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

Thanks a bunch!
I went for the first and it's working, though I am wondering: there must be a way to load things automatically just by referencing the structure?
Is there a way to basically load the file into the structure by field order and value type, instead of doing a

Code: Select all

mdl.value = br.ReadType();
for each value in the struct ?

As of now I am doing it like so:

Code: Select all

        private void ReadMDL()
        {
            stmdl mdl = new stmdl();

            // 'using' closes the reader (and the file) even if a read throws
            using (BinaryReader br = new BinaryReader(File.Open(textBox_mdlpath.Text, FileMode.Open)))
            {
                // the header is read exactly once from the start of the file,
                // so the original while/break loop is not needed
                mdl.type = new string(br.ReadChars(4));
                mdl.version = br.ReadInt32();
                mdl.checksum_vertex = br.ReadInt32();
                mdl.name = new string(br.ReadChars(64)).Trim('\0');
                mdl.dataLength = br.ReadInt32();

                mdl.eyeposition = ReadVector3(br);
                mdl.illumposition = ReadVector3(br);
                mdl.hull_min = ReadVector3(br);
                mdl.hull_max = ReadVector3(br);
                mdl.view_bbmin = ReadVector3(br);
                mdl.view_bbmax = ReadVector3(br);

                mdl.flags = br.ReadInt32();
                mdl.bone_count = br.ReadInt32();
                mdl.bone_offset = br.ReadInt32();
                mdl.bonecontroller_count = br.ReadInt32();
                mdl.bonecontroller_offset = br.ReadInt32();
                mdl.hitbox_count = br.ReadInt32();
                mdl.hitbox_offset = br.ReadInt32();
                mdl.localanim_count = br.ReadInt32();
                mdl.localanim_offset = br.ReadInt32();
                mdl.localseq_count = br.ReadInt32();
                mdl.localseq_offset = br.ReadInt32();
                mdl.activitylistversion = br.ReadInt32();
                mdl.eventsindexed = br.ReadInt32();
                mdl.texture_count = br.ReadInt32();
                mdl.texture_offset = br.ReadInt32();
                mdl.texturedir_count = br.ReadInt32();

                //etc more lines of this kind
            }
        }
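The `ReadVector3` helper is not shown in the post. A minimal sketch of what it might look like, assuming a vector is stored as three consecutive 32-bit floats; the `Vector3` struct here is a hypothetical stand-in for whatever vector type the tool really uses:

```csharp
using System.IO;

// Hypothetical stand-in for the tool's own vector type
public struct Vector3
{
    public float X, Y, Z;
}

public static class VectorReader
{
    // Assumes x, y, z are stored back to back as 32-bit floats
    public static Vector3 ReadVector3(BinaryReader br)
    {
        Vector3 v;
        v.X = br.ReadSingle();
        v.Y = br.ReadSingle();
        v.Z = br.ReadSingle();
        return v;
    }
}
```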
XZodia
Staff
Posts: 2208
Joined: Sun Dec 09, 2007 2:09 pm
Location: UK
Contact:

Re: Learning C# / loading binary file into struct best metho

Post by XZodia »

There is a magic thing called reflection which generalises the procedure, but it's a bit of a pain in the ass to set up.
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

Thanks, well if you say it's a pain to set up I trust your experience and I'll keep doing it the simple way :XD: It just seemed a little redundant to make a class/struct and then have to load each item one by one, very much like an array.
Last edited by neodos on Tue Jun 05, 2012 5:06 pm, edited 1 time in total.
Prey
Posts: 129
Joined: Sat Dec 29, 2007 5:06 pm
Location: UK

Re: Learning C# / loading binary file into struct best metho

Post by Prey »

If you're going to keep changing the class members then it can be useful to use reflection. Here are some code snippets:

Includes

Code: Select all

using System;
using System.IO;
using System.Reflection;
Reflection class

Code: Select all

    public static class ReflectionUtils
    {
        public static void Read(object classToFill, BinaryReader br)
        {
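            // use reflection to get all the public instance fields in the class
            // (note: GetFields does not guarantee declaration order, so check that
            // the order you get matches the file layout)
            FieldInfo[] fields = classToFill.GetType().GetFields(BindingFlags.Public | BindingFlags.Instance);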

            for (int i = 0; i < fields.Length; i++)
            {
                // check for array first, as we treat them differently
                if (fields[i].FieldType.IsArray)
                {
                    // get the type of field the array holds (int, byte etc)
                    Type elemType = fields[i].FieldType.GetElementType();

                    // get the array instance
                    Array array = (Array)fields[i].GetValue(classToFill);

                    // loop the entire array and fill it with values
                    for (int j = 0; j < array.Length; j++)
                        array.SetValue(ReadType(elemType, br), j);

                    // place the now filled array back in the class
                    fields[i].SetValue(classToFill, array);
                }
                // else just set the fields value depending on the field type
                else fields[i].SetValue(classToFill, ReadType(fields[i].FieldType, br));
            }
        }

        private static object ReadType(Type typeToRead, BinaryReader br)
        {
            if (typeToRead == typeof(byte)) return br.ReadByte();
            // chars are stored as single bytes in the file, so read one byte and cast
            else if (typeToRead == typeof(char)) return (char)br.ReadByte();
            else if (typeToRead == typeof(short)) return br.ReadInt16();
            else if (typeToRead == typeof(int)) return br.ReadInt32();
            else if (typeToRead == typeof(float)) return br.ReadSingle();
            // add more types here...
            else throw new NotSupportedException(typeToRead.Name + " is not supported");
        }
    }

Example usage:

Code: Select all

    public class Program
    {
        static void Main(string[] args)
        {
            // generate data
            Random rdm = new Random();
            byte[] data = new byte[1000];
            rdm.NextBytes(data);
            MemoryStream ms = new MemoryStream();
            ms.Write(data, 0, data.Length);
            ms.Position = 0;

            BinaryReader br = new BinaryReader(ms);

            // populate class using reflection
            Header h = new Header();
            ReflectionUtils.Read(h, br);

            br.Close();
        }
    }

    public class Header
    {
        public int a;
        public int b;

        public char[] _str = new char[10];
        public string str { get { return new string(_str); } }
    }
Using this method you can't have string fields as the program wouldn't know how many characters to read in from the stream. So instead in the Header class I use reflection to fill the '_str' char array, then I can use the 'str' property to view the char array as a string.

This stuff is useful for quickly testing values when traversing a hex editor is not as easy, which can be the case when you've followed some deeply nested reflexive. But for production code, I would convert to the 'a = br.ReadInt32(); b = br.Read...' style for speed and maintainability.
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

Amazing, thanks so much for these examples, I am going to try that. So you are saying that the reflection method is not really recommended, and I should stick to loading each value individually with the br once I know the structure fully?
Prey
Posts: 129
Joined: Sat Dec 29, 2007 5:06 pm
Location: UK

Re: Learning C# / loading binary file into struct best metho

Post by Prey »

Yes, as it is a lot faster; reflection has to look up the type's metadata at run time and box each value it sets, which takes longer.
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

Good to know, thanks a lot!
OwnZ joO
Posts: 1197
Joined: Sun Dec 09, 2007 4:46 pm

Re: Learning C# / loading binary file into struct best metho

Post by OwnZ joO »

Prey wrote:Yes, as it is a lot faster; reflection has to look up the type's metadata at run time and box each value it sets, which takes longer.
The plugin system I wrote for Excalibur traversed fields only on instantiation of the object and added objects that all implemented functionality to a list of fields. But yes, reflection is generally slower.
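A minimal sketch of that "traverse the fields only once" idea, not the actual Excalibur code: the field list for each type is looked up a single time and then reused. `FieldCache` and `Sample` are made-up names, and the cache is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Made-up example type, only here so the sketch has something to inspect
public class Sample
{
    public int a;
    public float b;
}

public static class FieldCache
{
    private static readonly Dictionary<Type, FieldInfo[]> cache =
        new Dictionary<Type, FieldInfo[]>();

    // Returns the public instance fields of a type; reflection runs once per type
    public static FieldInfo[] GetFields(Type type)
    {
        FieldInfo[] fields;
        if (!cache.TryGetValue(type, out fields))
        {
            fields = type.GetFields(BindingFlags.Public | BindingFlags.Instance);
            cache[type] = fields;
        }
        return fields;
    }
}
```

This only removes the repeated lookup; each `FieldInfo.SetValue` call still goes through reflection, so hand-written reads remain the faster option.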
MickLH
Posts: 5
Joined: Sat Aug 31, 2013 12:23 am

Re: Learning C# / loading binary file into struct best metho

Post by MickLH »

Uh... guys? Just use C++ ?
If you are trying to read a chunk of actual binary data, you don't want to be calling all the way up virtual function trees (inside a VM :roll: ) for every couple of BYTES of the file! This is only necessary when the data itself is in a different endianness than the CPU, which is almost never the case with binary files (for the same reason!)

If you need to use C# for your app, make a C++ library to read and process your binary data quickly, and offer it in a convenient format to [DllImport]

Code: Select all

// Dead simple to define data layout
typedef struct {
	h2_object_info Object;
	u32 ItemFlags;
.............
	int str_SwitchedTo;
	int str_SwapAI;
	h2_dependency unknown1;
	h2_dependency CollisionSound;
	h2_reflexive PredictedBitmaps;
	h2_dependency DetonationDamageEffect1;
...............
} h2_weap_meta_t;
h2_weap_meta_t weap;

// Dead simple to read entire data at once
fread(&weap, sizeof(weap), 1, mapStream);
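For the endianness case mentioned above, here is a small C# sketch of reading a big-endian value by hand. `BinaryReader` always reads little-endian, so a big-endian file needs something like this. The `Endian` class name is made up for the example.

```csharp
public static class Endian
{
    // Reads a 32-bit big-endian signed integer from buffer at offset,
    // independent of the byte order of the CPU the code runs on
    public static int ReadInt32BigEndian(byte[] buffer, int offset)
    {
        return (buffer[offset] << 24)
             | (buffer[offset + 1] << 16)
             | (buffer[offset + 2] << 8)
             | buffer[offset + 3];
    }
}
```

The same byte-by-byte problem applies to the `fread` approach: a raw block read only gives correct values when the file and the CPU agree on byte order and on struct padding.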
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

If you need to use C# for your app, make a C++ library to read and process your binary data quickly, and offer it in a convenient format to [DllImport]
That doesn't sound like a very productive or optimal thing to do when you can do it properly in C# directly and learn more about that language, which is the point of this thread. I am not trying to get super performance.

Do you realize that learning a new language in order to write a binary file parser would take me longer than actually just trying to code it with what I currently know (C#)? It would be pretty stupid and counter productive to try to learn a new language like C++ when I am not even half way with C#.

I might switch to C++ the day I start to see the limitations of C# and need extra performance, for now C# does the job pretty damn well, I am not trying to code a game engine nor a super stable software for a space rocket here.

Anyways, I ended up making a class for the file structure and made a function that takes a byte array and reads/fills the fields by type.
Yes, maybe C++ already has its own built-in function to read a binary file into a structure, but that's not good enough; even if C# has its own built-in function to do that as well, it's not what I needed.
I needed to write my own parser so that I could properly parse certain specific data types like vectors into a custom class, or parse the strings that have a specific length and need some cleaning.

This is what I ended up with. Maybe it can be optimized, or maybe the code can be shorter and cleaner, but for me it does the fucking job. I don't need to worry about it once it's coded, I just use the function, and I don't need to worry about performance when it loads a 500kb file in less than a hundredth of a second.

Code: Select all

        
        public static Object binary2_struct(byte[] bin, Type objType)
        {
            //create instance of object from objType
            Object obj = Activator.CreateInstance(objType);

            int i = 0;

            foreach (PropertyInfo prop in obj.GetType().GetProperties())
            {
                // dispatch on the CLR type name of the property (e.g. "Int32", "Single")

                string t = prop.PropertyType.Name;

                switch (t)
                {
                    case "Int32":
                        prop.SetValue(obj, BitConverter.ToInt32(bin, i), null);
                        i += 4;
                        break;
                    case "Int16":
                        prop.SetValue(obj, BitConverter.ToInt16(bin, i), null);
                        i += 2;
                        break;
                    case "Vector3":
                        prop.SetValue(obj, brReadVector(bin, 3,i), null);
                        i += 12;
                        break;
                    case "Vector2":
                        prop.SetValue(obj, brReadVector(bin, 2, i), null);
                        i += 8;
                        break;
                    case "Single": // typeof(float).Name is "Single", so a "Float" case never matches
                        prop.SetValue(obj, BitConverter.ToSingle(bin, i), null);
                        i += 4;
                        break;
                    case "String":
                        int length = prop.GetValue(obj, null).ToString().Length;
 
                        prop.SetValue(obj, Encoding.UTF8.GetString(bin, i, length).Trim('\0'), null);
                        i += length;
                        break;

                }
            }

           // br.Close();
            return obj;

        }
// Dead simple to define data layout
It's not like it's any harder to define a file structure in C#.

Code: Select all

private Int32 dataLength;
        private Vector3 eyeposition;
        private Vector3 illumposition;
        private Vector3 hull_min;
        private Vector3 hull_max;

        private Vector3 view_bbmin;
        private Vector3 view_bbmax;

Don't get me wrong, I know C++ is way better in many ways in terms of performance optimization.
But clearly programming is not my profession here. I am a 3D artist and C# suits my needs very well: it is much more accessible and easier to learn, and it comes with a lot of default libraries which provide a lot of functionality and save time, whereas learning and using C++ would take me too long and I don't need that level of complexity to achieve the same results.

I started coding/programming with stuff like HTML, CSS, JavaScript, PHP and Python, then moved to VB.net and later C#. Maybe one day I'll need more performance and see limitations in C#, and then I'll move to C++. Until then C# suits me perfectly. And no, I am not being lazy or ignorant by not trying to learn C++; I just don't have the time to learn a new language as advanced as C++, and I need to keep learning C# until I know more about it before moving on to another one.

The only way I am going to see the limitations of C# is learning more about it and experimenting more.
I will also gain the experience from (almost, I guess you can't learn it all) fully learning this language (before moving to another) that I'll be able to use and compare to whatever I learn next and then be able to see what were its advantages and disadvantages.
MickLH
Posts: 5
Joined: Sat Aug 31, 2013 12:23 am

Re: Learning C# / loading binary file into struct best metho

Post by MickLH »

edit: I figured it out while writing this: "you want a scripting language with weak typing."

No offense intended friend. I'm just answering honestly and accurately instead of throwing my opinions into the thread. The "best" can only be quantified objectively if there is some hard metric to back it up. The only relevant hard metrics in this case are:
  1. Waste of the machine.
  2. Waste of your typing.
I see you didn't really mean "best" but hear me out...
That doesn't sound like a very productive or optimal thing to do when you can do it properly in C# directly and learn more about that language, which is the point of this thread. I am not trying to get super performance.
The best method in this case would be calling the operating system directly and using block IO. That is the important part, that makes it objectively the best: Using the OS's best method without a heavy series of wrappers around it.

(If you want to get really nitty gritty, you should memory map the file and then just calculate your pointers to your structures and you are done, no loading necessary, a good OS will even use available info to implement lazy loading for you.)
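For reference, memory mapping is also available from C# through `System.IO.MemoryMappedFiles` (.NET 4.0 and later). A minimal sketch, assuming a made-up 8-byte header; `FileHeader`, `Magic`, `Version` and `MappedReader` are illustration names only, not part of any real format:

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Hypothetical 8-byte header; Pack = 1 asks for no padding between fields
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct FileHeader
{
    public int Magic;
    public int Version;
}

public static class MappedReader
{
    public static FileHeader ReadHeader(string path)
    {
        // Map the file read-only and open a view covering just the header
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewAccessor(
                   0, Marshal.SizeOf(typeof(FileHeader)), MemoryMappedFileAccess.Read))
        {
            FileHeader header;
            view.Read(0, out header); // copies the bytes at offset 0 into the struct
            return header;
        }
    }
}
```

As with the `ReadStruct<T>` extension earlier in the thread, this only works for structs made of plain value types, and it assumes the file uses the same byte order as the machine.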
Do you realize that learning a new language in order to write a binary file parser would take me longer than actually just trying to code it with what I currently know (C#)? It would be pretty stupid and counter productive to try to learn a new language like C++ when I am not even half way with C#.
Do you realize that you don't know what you don't know? (And if you are thinking right now that you know what I am hinting at that you don't know, you missed it completely) You are too naive to understand the entire classes of benefits of knowing the machine you are working on and the language that system designers are using. By simply knowing C++ you will have much easier insight into binary structure since they are almost always designed by coders (not scripters who tend towards text formats). So all you _really_ wanted was a "good" method, not the "best", so don't get offended at me for answering accurately.
I might switch to C++ the day I start to see the limitations of C# and need extra performance, for now C# does the job pretty damn well, I am not trying to code a game engine nor a super stable software for a space rocket here.
I don't see any reason you would _have_ to switch, even for those tasks, math is math. A better algorithm always beats a better compiler trick.
Anyways, I ended up making a class for the file structure and made a function that takes a byte array and reads/fills the fields by type.
Yes, maybe C++ already has its own built-in function to read a binary file into a structure, but that's not good enough; even if C# has its own built-in function to do that as well, it's not what I needed.
I needed to write my own parser so that I could properly parse certain specific data types like vectors into a custom class, or parse the strings that have a specific length and need some cleaning.
Please have an open mind: This is where you show that you don't know what you don't know.
  1. I did not use the C++ function to read in binary data, that's just as poor as the C# interface. The benefit of using C++ is that you are also using C, you can directly call the OS instead (or through the FILE* wrappers) which is the only candidate for being considered the "best" given that any other approach is going to eventually boil down to this.

  2. If you read my example, you should notice all my nested classes (h2_reflexive, h2_dependency) are loaded all in that same one shot. The actual "stupid and counter productive" thing here is writing tons of shitty code that you will throw away instead of re-using because you don't know what you are doing and don't want to learn how to learn because I guess your way of learning is the best despite you needing to go to obscure forums to ask for help.
(hehe yeah, I just said read() is actually exactly what you needed but you are just too nub to even see that.)


No need to comment on the code, plenty of examples in this thread, yours is just fine man, "if it works it works"
Don't get me wrong, I know C++ is way better in many ways in terms of performance optimization.
But clearly programming is not my profession here. I am a 3D artist and C# suits my needs very well: it is much more accessible and easier to learn, and it comes with a lot of default libraries which provide a lot of functionality and save time, whereas learning and using C++ would take me too long and I don't need that level of complexity to achieve the same results.
It only takes too long if you have that attitude, so stop putting your own life down with negativity. If you learned C# you basically already know C++, provided you are decent with arrays.

...and well... programming IS my profession :P and I'm only taking the time to explain because I think it can help.
As far as libraries, you are simply talking about the .NET framework, which is absolutely available on C++.
BUT! in addition to providing all your .NET functions, AND easier interface to the data you are working with... the VAST majority of real-world libraries ONLY work on C++ and often the best you can do in C# is using a so-so wrapper with poor documentation. That's not accessible and SURE AS FUCK not easier. But I guess like you said you will have to see the limitations yourself, although it's sounding like you don't even want to program, you want a scripting language with weak typing.
The only way I am going to see the limitations of C# is learning more about it and experimenting more.
I will also gain the experience from (almost, I guess you can't learn it all) fully learning this language (before moving to another) that I'll be able to use and compare to whatever I learn next and then be able to see what were its advantages and disadvantages.
If you know the grammar and you can portray your points and understand others, you have "fully" learned the language. You are practicing now to get better at actually using the language. Don't be scared man!

tl;dr = I'm not trying to get you off C#, I'm trying to show you the "best" method you asked for, not just another "easy" method since you said you want to LEARN.

(PS I would think you would use google if you really just want an easy method to copy paste into your C# app)
neodos
Posts: 1493
Joined: Sun Dec 09, 2007 8:58 pm

Re: Learning C# / loading binary file into struct best metho

Post by neodos »

To be honest, it's just very strange to see some random person create an account on some random forum and reply to a one year old thread. The thread is clearly a question about C#, titled "learning C#" (not C++ !), and the question/problem had been solved, yet your first post in the forum and thread is "Uh... guys? Just use C++ ?".

I just think it's completely random and off-topic. You don't know me, you don't know what I do in general at all, and you are making assumptions about me and being very arrogant about it.
You don't know this forum, the people in this forum spent years together making mods for Halo 2 and know more or less each others skills.

Why the hell should I anyways be listening to some random person like you who creates an account on a random forum and replies to a one year old thread, totally off topic etc.
I clearly was not looking for a better language and stated that I don't have time to switch to another language like C++ upon your reply, some people have priorities and limited time to manage and spend on different activities, maybe you should think about that and respect it.
It just seems like you posted to pick arguments about programming languages.

The thread wasn't about picking the best language and what was being asked was already solved anyways, a year ago, therefore if you are looking for arguments about programming language choices with random strangers, this isn't the place nor what the topic was about and I am done wasting my time, how stupid of me to even reply in the first place.