Google's Protocol Buffers
Protocol buffers are a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols and data storage. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams, using a variety of languages. You can even update your data structure without breaking deployed programs that are compiled against the "old" format.

https://developers.google.com/protocol-buffers/docs/overview

The initial design of protocol buffers was aimed at Java, C++ and Python developers. Marc Gravell, a developer at Stack Exchange/Stack Overflow, designed and implemented protobuf-net, a protocol buffers implementation for C#.
http://code.google.com/p/protobuf-net/

C# classes
Below is an example of a class definition using C#.

    using System;

    [Serializable]
    public class Person {
        public int Id { get; set; }
        public string Name { get; set; }
        public Address Address { get; set; }
    }

    [Serializable]
    public class Address {
        public string Line1 { get; set; }
        public string Line2 { get; set; }
    }

When instances of the above classes are serialized using standard .NET serialization, the serialized data looks as follows:

Person class: [image: Person]

Address class: [image: Address]

Protocol Buffer classes in C# - http://code.google.com/p/protobuf-net/wiki/GettingStarted

Below is an example of a class definition that is compliant with Protocol Buffers in C#.

    using ProtoBuf;

    [ProtoContract]
    public class ProtobufAddress
    {
        [ProtoMember(1)]
        public string Line1 { get; set; }
        [ProtoMember(2)]
        public string Line2 { get; set; }
    }

    [ProtoContract]
    public class ProtobufPerson
    {
        [ProtoMember(1)]
        public int Id { get; set; }

        [ProtoMember(2)]
        public string Name { get; set; }

        [ProtoMember(3)]
        public ProtobufAddress Address { get; set; }
    }

Marking a class with ProtoContract makes it serializable by protobuf-net. Marking a property with ProtoMember includes that property in the serialized output, which is encoded in the Protocol Buffers format.

Unlike .NET serialization, the names of members/properties within a class are not encoded into the stream. Instead, each member in the class must be explicitly identified by a positive integer.

Notes for Identifiers

  • Identifiers must be positive integers
  • Identifiers must be unique within a single type; the same numbers can be re-used in sub-types if inheritance is enabled
  • Identifiers must not conflict with any inheritance identifiers
  • Lower identifiers take less space in the serialized object
  • The identifier, not the member name, defines the data. A member can be renamed, or switched between a property and a field, but changing its identifier changes the data. Any data that was previously serialized and written to a persistence mechanism can then no longer be read using the class. Possible workarounds include, but are not limited to:
    • Creating a new class that inherits from the previously defined class
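
The space cost of an identifier can be checked by hand. In the Protocol Buffers wire format, each field is preceded by a key, (field number << 3) | wire type, written as a base-128 varint with 7 payload bits per byte. The sketch below counts the bytes that key occupies. The names TagSize and KeyBytes are hypothetical helpers for illustration and are not part of protobuf-net.

```csharp
using System;

public static class TagSize
{
    // Number of bytes the field key occupies on the wire.
    // The key is (fieldNumber << 3) | wireType, encoded as a base-128 varint.
    public static int KeyBytes(int fieldNumber, int wireType = 0)
    {
        if (fieldNumber < 1)
            throw new ArgumentOutOfRangeException(nameof(fieldNumber));

        ulong key = ((ulong)fieldNumber << 3) | (uint)wireType;
        int bytes = 1;
        while (key >= 0x80)   // more than 7 bits left: another byte is needed
        {
            key >>= 7;
            bytes++;
        }
        return bytes;
    }
}
```

Under this encoding, identifiers 1 to 15 need one key byte and identifiers 16 to 2047 need two, so the most frequently used members are best given identifiers below 16.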

Notes for types

  • Supports custom classes that:
    • Are marked as data-contract
    • Have a parameterless constructor
    • For Silverlight: are public
  • Many common primitives etc
  • Single dimension arrays: T[]
  • List<T> / IList<T>
  • Dictionary<TKey,TValue> / IDictionary<TKey,TValue>
  • Any type which implements IEnumerable<T> and has an Add(T) method

Data Serialization and deserialization

Protocol buffers is a binary format, so protobuf-net is built heavily around the Stream class. protobuf-net provides classes for serializing data to a stream and deserializing it from one.
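
A minimal round-trip sketch, assuming the protobuf-net library is referenced. It uses the library's Serializer.Serialize and Serializer.Deserialize<T> methods with a MemoryStream. The RoundTrip class and its method names are made up for illustration, and the contract types are repeated from the section above so that the sketch compiles on its own.

```csharp
using System.IO;
using ProtoBuf;

// Contract types repeated from the section above.
[ProtoContract]
public class ProtobufAddress
{
    [ProtoMember(1)] public string Line1 { get; set; }
    [ProtoMember(2)] public string Line2 { get; set; }
}

[ProtoContract]
public class ProtobufPerson
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
    [ProtoMember(3)] public ProtobufAddress Address { get; set; }
}

// Hypothetical helper class, not part of protobuf-net.
public static class RoundTrip
{
    // Serializes a person to a byte array in the protocol buffers format.
    public static byte[] ToBytes(ProtobufPerson person)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, person);
            return stream.ToArray();
        }
    }

    // Reads a person back from bytes produced by ToBytes.
    public static ProtobufPerson FromBytes(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        {
            return Serializer.Deserialize<ProtobufPerson>(stream);
        }
    }
}
```

Calling RoundTrip.FromBytes(RoundTrip.ToBytes(person)) should give back an object with the same Id, Name and Address values. Any other Stream, such as a FileStream, can be used in place of the MemoryStream.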

When data is serialized using Google Protocol Buffers, the serialized data looks as follows:

Person class: [image: PersonProtobuf]

Address class: [image: AddressProtobuf]


Output Comparison

[image: Output]

Large Datasets
Protocol Buffers are not designed to handle large messages. If you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy. Protocol Buffers are great for handling individual messages within a large data set. Usually, large data sets are really just a collection of small pieces, where each small piece may be a structured piece of data.

Even though Protocol Buffers cannot handle the entire set at once, using Protocol Buffers to encode each piece greatly simplifies your problem: now all you need is to handle a set of byte strings rather than a set of structures.

Protocol Buffers do not include any built-in support for large data sets because different situations call for different solutions. Sometimes a simple list of records will do while other times you may want something more like a database. Each solution should be developed as a separate library, so that only those who need it need to pay the costs.
https://developers.google.com/protocol-buffers/docs/techniques
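
One possible way to treat a large data set as a sequence of small messages, sketched with protobuf-net: each record is written with a length prefix so that the reader can find the message boundaries, and records are read back one at a time. This is an illustration under assumptions, not a technique prescribed by the Protocol Buffers documentation. The LogRecord type, the RecordStream class and the choice of field number 1 are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical small record type used only for this illustration.
[ProtoContract]
public class LogRecord
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Text { get; set; }
}

// Hypothetical helper class, not part of protobuf-net.
public static class RecordStream
{
    // Appends each record to the stream as its own length-prefixed message.
    public static void WriteAll(Stream destination, IEnumerable<LogRecord> records)
    {
        foreach (var record in records)
        {
            Serializer.SerializeWithLengthPrefix(destination, record, PrefixStyle.Base128, 1);
        }
    }

    // Yields records lazily as the caller enumerates, so the whole
    // data set is never held in memory at once.
    public static IEnumerable<LogRecord> ReadAll(Stream source)
    {
        return Serializer.DeserializeItems<LogRecord>(source, PrefixStyle.Base128, 1);
    }
}
```

Because ReadAll is lazy, the source stream must stay open until the caller has finished enumerating the records.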

Getting Started
The Getting Started wiki at http://code.google.com/p/protobuf-net/wiki/GettingStarted explains in detail how to start using protocol buffers in your application.

The sample application uses the library created by Marc Gravell and demonstrates its use in an ASP.NET MVC application. The example is written in C#.


Last edited Dec 7, 2012 at 7:52 AM by raghupenumarthi, version 14