Sunday, December 20, 2009

Expression Trees - serializing your data

Update: Just to clarify - the code snippets below are under the MIT/X11 license.

I spent a few hours over the weekend writing a binary serializer using expression trees. I wanted to see how things would look using the new features available in .NET 4.0. My requirements were pretty simple:

1) Serialize all public properties in a type or a subset of them
2) Control the order in which they're serialized - sometimes you need to interop with an existing format and must write your data in a specific order
3) Control how a primitive is converted - Do you need to write value types in big endian, little endian, middle endian?
4) Easy to use API.

So let's start with the API. This is what I was hoping to use:

public class Secondary
{
    public int First { get; set; }
    public int Second { get; set; }
    public int Third { get { return First + Second; } }
}

public class MyClass
{
    public byte ByteProp { get; set; }
    public short ShortProp { get; set; }
    public int IntProp { get; set; }
    public long LongProp { get; set; }
    public string StringProp { get; set; }
}

static void Main(string[] args)
{
    // Register a message so that all public properties will be serialized
    Message.Register<MyClass>();

    // Register a message so that only some properties are serialized and
    // they are serialized in the specified order
    Message.Register<Secondary>(
        d => d.Second,
        d => d.First
    );

    // Create a stream to serialize the data to
    Stream s = new MemoryStream();
    var message = new MyClass {
        IntProp = 1,
        LongProp = 2,
        ByteProp = 3,
        ShortProp = 4,
        StringProp = "Hello World"
    };

    // Encode the message to the stream
    MessageEncoder.Encode(message, s);

    // Rewind the stream and then decode the message
    s.Position = 0;
    var decoded = MessageDecoder.Decode<MyClass>(s);
}

It's pretty standard stuff. You can work with the standard serializer logic (serialize properties alphabetically) by registering an object without specifying any specific properties or you can customise which properties are serialized. This could also be done using attributes, but using attributes to control the order in which properties are serialized would be more error prone than the above.
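The post does not show how the default (no properties specified) case picks its properties. A minimal sketch of one way it could work, assuming that "all public properties" means public instance properties with both a getter and a setter, ordered by name. `DefaultProperties` is a hypothetical helper that is not part of the code in this post, and `Secondary` is repeated only so the sketch compiles on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class Secondary
{
    public int First { get; set; }
    public int Second { get; set; }
    public int Third { get { return First + Second; } }
}

public static class DefaultPropertySketch
{
    // Hypothetical helper: public instance properties of T that can be both
    // read and written, sorted alphabetically (ordinal comparison).
    public static IEnumerable<PropertyInfo> DefaultProperties<T>()
    {
        return typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            // Skip read-only or write-only properties: they cannot round-trip
            .Where(p => p.GetGetMethod() != null && p.GetSetMethod() != null)
            // Skip indexers, which need arguments to read
            .Where(p => p.GetIndexParameters().Length == 0)
            .OrderBy(p => p.Name, StringComparer.Ordinal);
    }
}
```

For `Secondary` this yields `First` then `Second`; `Third` is skipped because it has no setter. The result could be handed to the `RegisterMessage<T>(IEnumerable<PropertyInfo>)` overload shown further down.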

Sometimes you need to write your data in big endian, other times in little endian, and sometimes you won't care. What you need is a way to control this:

MessageEncoder.RegisterPrimitiveEncoder<int>((value, stream) => {
    stream.Write(BitConverter.GetBytes(value));
});

It's simple. Any type which can be directly converted to an array of bytes is classified as a 'primitive'. Each primitive can have an encoder/decoder pair registered as above.
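Note that `BitConverter.GetBytes` uses the byte order of the machine it runs on, so an encoder built on it alone is not portable. As a rough sketch, here are two int encoders that fix the byte order explicitly. The class and method names are made up for illustration and are not part of the serializer:

```csharp
using System;
using System.IO;

public static class EndianSketch
{
    // Most significant byte first (network order), regardless of the host CPU.
    public static void WriteInt32BigEndian(int value, Stream stream)
    {
        unchecked {
            stream.WriteByte((byte)(value >> 24));
            stream.WriteByte((byte)(value >> 16));
            stream.WriteByte((byte)(value >> 8));
            stream.WriteByte((byte)value);
        }
    }

    // Least significant byte first, regardless of the host CPU.
    public static void WriteInt32LittleEndian(int value, Stream stream)
    {
        unchecked {
            stream.WriteByte((byte)value);
            stream.WriteByte((byte)(value >> 8));
            stream.WriteByte((byte)(value >> 16));
            stream.WriteByte((byte)(value >> 24));
        }
    }
}
```

Both methods have the shape of an `Action<int, Stream>`, so either one could be passed to `RegisterPrimitiveEncoder<int>` in place of the lambda above. The matching decoder has to read the bytes back in the same order.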

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Net;
using System.Reflection;
using System.Text;

// Note: stream.Write(byte[]) and AsPropertyInfo() are helper extension methods
// which are not shown in this post.
public static class MessageEncoder
{
    static Dictionary<Type, Delegate> encoders;
    static Dictionary<Type, Delegate> primitives;

    static MessageEncoder()
    {
        encoders = new Dictionary<Type, Delegate>();
        primitives = new Dictionary<Type, Delegate>();
        RegisterPrimitiveEncoders();
    }

    static void RegisterPrimitiveEncoders()
    {
        RegisterPrimitiveEncoder<byte>((value, stream) =>
            stream.WriteByte(value)
        );

        RegisterPrimitiveEncoder<short>((value, stream) =>
            stream.Write(BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value)))
        );

        RegisterPrimitiveEncoder<int>((value, stream) =>
            stream.Write(BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value)))
        );

        RegisterPrimitiveEncoder<long>((value, stream) =>
            stream.Write(BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value)))
        );

        // Strings are written as a length prefix followed by the UTF8 bytes
        var intWriter = (Action<int, Stream>)primitives[typeof (int)];
        RegisterPrimitiveEncoder<string>((value, stream) => {
            var buffer = Encoding.UTF8.GetBytes(value);
            intWriter(buffer.Length, stream);
            stream.Write(buffer);
        });
    }

    public static void RegisterPrimitiveEncoder<T>(Action<T, Stream> encoder)
    {
        primitives [typeof (T)] = encoder;
    }

    public static void RegisterMessage<T>(params Expression<Func<T, object>>[] properties)
    {
        RegisterMessage<T>(properties.Select(p => p.AsPropertyInfo ()));
    }

    public static void RegisterMessage<T>(IEnumerable<PropertyInfo> properties)
    {
        var propertyEncoders = new List<Expression>();

        // The encode function takes an instance of the class we're encoding and the Stream
        // which we should write the data to.
        ParameterExpression source = Expression.Parameter(typeof(T), "source_param");
        ParameterExpression stream = Expression.Parameter(typeof(Stream), "stream");

        // For each property, get the encoder which will convert the value of the property to a byte[]
        // which can be written to the stream.
        foreach (var property in properties) {
            // Get the encoder for this property type
            var action = primitives[property.PropertyType];
            // Create a var which holds the Action <T, Stream> which encodes the data to the stream
            Expression converter = Expression.Constant(action, action.GetType ());
            // Invoke the encoder passing the value of the property and the 'stream'
            Expression invoker = Expression.Invoke(converter, Expression.Property(source, property), stream);
            // Add the encoder for this property to the list.
            propertyEncoders.Add(invoker);
        }

        // Create an expression block which will execute each of the encoders one by one
        Expression block = Expression.Block(propertyEncoders);
        encoders.Add(typeof(T), Expression.Lambda<Action<T, Stream>>(
            block,
            source,
            stream
        ).Compile());
    }

    public static void Encode<T>(T message, Stream s)
    {
        var encoder = (Action<T, Stream>)encoders[typeof (T)];
        encoder (message, s);
    }
}
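The encoder relies on two helpers that are not shown in the post: a `Write(byte[])` call on `Stream` and `AsPropertyInfo()` on the property lambdas. A possible sketch of both is below. The bodies are my assumption of what such helpers would look like, not the original implementation:

```csharp
using System;
using System.IO;
using System.Linq.Expressions;
using System.Reflection;

public static class EncoderHelperSketch
{
    // Assumed helper: writes the whole buffer to the stream.
    public static void Write(this Stream stream, byte[] buffer)
    {
        stream.Write(buffer, 0, buffer.Length);
    }

    // Assumed helper: extracts the PropertyInfo from a lambda such as d => d.Second.
    public static PropertyInfo AsPropertyInfo<T>(this Expression<Func<T, object>> accessor)
    {
        Expression body = accessor.Body;

        // A value-type property is boxed to 'object', which shows up as a
        // Convert node wrapped around the actual member access.
        if (body.NodeType == ExpressionType.Convert)
            body = ((UnaryExpression)body).Operand;

        var member = body as MemberExpression;
        var property = member == null ? null : member.Member as PropertyInfo;
        if (property == null)
            throw new ArgumentException("Expected a lambda of the form x => x.Property", "accessor");
        return property;
    }
}
```

The boxing here only happens inside the lambda used at registration time. The compiled encoder reads the property through `Expression.Property`, so no boxing occurs when a message is actually written.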

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Linq.Expressions;
using System.Reflection;
using System.Net;
using System.IO;

namespace Encoder
{
    // Note: ReadShort(), ReadInt(), ReadLong() and AsPropertyInfo() are helper
    // extension methods which are not shown in this post.
    public static class MessageDecoder
    {
        static Dictionary<Type, Delegate> decoders;
        static Dictionary<Type, Delegate> primitives;

        static MessageDecoder()
        {
            decoders = new Dictionary<Type, Delegate>();
            primitives = new Dictionary<Type, Delegate>();
            RegisterDefaultDecoders();
        }

        static void RegisterDefaultDecoders()
        {
            RegisterPrimitiveDecoder<byte>((s) => {
                var val = s.ReadByte();
                if (val == -1)
                    throw new EndOfStreamException();
                return (byte)val;
            });

            RegisterPrimitiveDecoder<short>((s) => IPAddress.NetworkToHostOrder (s.ReadShort()));
            RegisterPrimitiveDecoder<int>(s => IPAddress.NetworkToHostOrder (s.ReadInt()));
            RegisterPrimitiveDecoder<long>(s => IPAddress.NetworkToHostOrder (s.ReadLong()));

            // Strings are read as a length prefix followed by the UTF8 bytes
            var intDecoder = (Func<Stream, int>)primitives[typeof(int)];
            RegisterPrimitiveDecoder<string>(s => {
                var length = intDecoder(s);
                var buffer = new byte[length];
                // Stream.Read may return fewer bytes than requested, so keep
                // reading until the buffer is full.
                int offset = 0;
                while (offset < buffer.Length) {
                    int read = s.Read(buffer, offset, buffer.Length - offset);
                    if (read == 0)
                        throw new EndOfStreamException();
                    offset += read;
                }
                return Encoding.UTF8.GetString(buffer);
            });
        }

        public static void RegisterPrimitiveDecoder<T>(Func<Stream, T> decoder)
        {
            // Use the indexer rather than Add so that a default decoder can be
            // replaced, matching RegisterPrimitiveEncoder.
            primitives[typeof(T)] = decoder;
        }

        public static void RegisterMessage<T>(params Expression<Func<T, object>>[] properties)
        {
            RegisterMessage<T>(properties.Select(d => d.AsPropertyInfo()));
        }

        public static void RegisterMessage<T>(IEnumerable<PropertyInfo> properties)
        {
            var propertyDecoders = new List<Expression>();

            // The decode function takes an instance of the class we're decoding into and the Stream
            // containing the data to decode.
            ParameterExpression source = Expression.Parameter(typeof(T), "source_param");
            ParameterExpression stream = Expression.Parameter(typeof(Stream), "stream");

            // For each property, get the primitive decoder which will read data from the stream and
            // return a value of the correct type.
            foreach (var property in properties) {
                var action = primitives[property.PropertyType];
                // Create a var which holds the Func <Stream, T> which decodes the data from the stream
                Expression decoder = Expression.Constant(action, action.GetType());
                // Invoke the decoder passing 'stream' as the parameter
                Expression invoker = Expression.Invoke(decoder, stream);
                // Store the return value of the decoder in the property.
                Expression setter = Expression.Call(source, property.GetSetMethod(), invoker);
                // Add the decoder for this property to the list.
                propertyDecoders.Add(setter);
            }

            // Create a block which will execute the decoders for all the properties one after another.
            Expression block = Expression.Block(propertyDecoders);
            decoders.Add (typeof (T), Expression.Lambda<Action<T, Stream>>(
                block,
                source,
                stream
            ).Compile ());
        }

        public static T Decode<T>(Stream s) where T : class, new()
        {
            T t = new T();
            var decoder = (Action<T, Stream>)decoders[typeof(T)];
            decoder(t, s);
            return t;
        }
    }
}
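The decoder calls `ReadShort()`, `ReadInt()` and `ReadLong()` on the stream, which are not shown in the post. A sketch of how they might look, assuming they return the raw bytes interpreted in host order (the mirror image of `BitConverter.GetBytes` in the encoder, leaving the byte swap to `NetworkToHostOrder`). `ReadBytes` is a hypothetical helper name:

```csharp
using System;
using System.IO;

public static class DecoderHelperSketch
{
    // Hypothetical helper: reads exactly 'count' bytes or throws.
    // A single Stream.Read call may return fewer bytes than requested.
    public static byte[] ReadBytes(this Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count) {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }

    // Assumed shapes of the helpers used by the default decoders.
    public static short ReadShort(this Stream stream)
    {
        return BitConverter.ToInt16(stream.ReadBytes(2), 0);
    }

    public static int ReadInt(this Stream stream)
    {
        return BitConverter.ToInt32(stream.ReadBytes(4), 0);
    }

    public static long ReadLong(this Stream stream)
    {
        return BitConverter.ToInt64(stream.ReadBytes(8), 0);
    }
}
```

Throwing on a short read keeps a truncated stream from silently producing a half-filled value.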


The idea is quite simple. For each class we can generate an ideal serializer using expression trees which doesn't require boxing or casting. Reflection is only used once, when the type is registered; after that the compiled delegate does all the work, so we avoid the performance penalty that reflection would incur on every object. The code above only handles the simple case where a class consists of primitive types (int, long, string), though it'd be easy enough to extend it to support more complex scenarios.

The serializer as you see it could not have been written with the expression trees in .NET 3.5. Some of the key components like BlockExpression were only introduced with .NET 4.0. If your object contains an array which needs to be serialized, you'll need the new IndexExpression too. Sure, it's possible to fake these using some anonymous delegates and Actions, but that's not pretty :)
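To give an idea of what the array case involves, here is a small sketch that fills an `int[]` from a value source using a block, a loop and an IndexExpression (created by `Expression.ArrayAccess`). It is an illustration only and is not wired into the serializer above; `BuildFiller` and the `Func<int>` source are made-up stand-ins for a real primitive decoder:

```csharp
using System;
using System.Linq.Expressions;

public static class ArrayFillSketch
{
    // Builds the equivalent of:
    //   (target, next) => { for (int i = 0; i < target.Length; i++) target[i] = next(); }
    public static Action<int[], Func<int>> BuildFiller()
    {
        var target = Expression.Parameter(typeof(int[]), "target");
        var next = Expression.Parameter(typeof(Func<int>), "next");
        var i = Expression.Variable(typeof(int), "i");
        var done = Expression.Label("done");

        var body = Expression.Block(
            new[] { i },
            Expression.Assign(i, Expression.Constant(0)),
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.LessThan(i, Expression.ArrayLength(target)),
                    Expression.Block(
                        // ArrayAccess yields an assignable IndexExpression
                        Expression.Assign(Expression.ArrayAccess(target, i), Expression.Invoke(next)),
                        Expression.PostIncrementAssign(i)),
                    Expression.Break(done)),
                done));

        return Expression.Lambda<Action<int[], Func<int>>>(body, target, next).Compile();
    }
}
```

In a real decoder the `next` delegate would be replaced by an invocation of the registered primitive decoder with the stream as its argument.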

The total implementation is less than 170 LOC. I'd be willing to bet that with another 100 LOC you could support most constructs. If you're currently a heavy user of reflection to provide object serialization, it's time to update ;)

10 comments:

Unknown said...

> If you're currently a heavy user of to provide object serialization, it's time to update ;)

I think you missed something here.

WorldMaker said...

This is also really cool, Alan. Would you mind explicitly declaring a license (add useful comments to the top of the file blocks) for this and the ChangeNotifier classes? (Ms-PL would certainly be awesome, if you don't have a particular license in mind.)

Also, maybe it's time to collect these into one or more source code repositories and post them to Bitbucket or Github or Launchpad or somewhere.

Alan said...

@Don: *doh*. Blogger has developed an annoying bug where it deletes two words instead of one when you're editing. I catch it most of the time, but obviously not there ;) I meant to have the word "reflection" in there. Post updated.

@WorldMaker: I updated to explicitly put it under the MIT/X11 license. The idea of putting them in a VCS is nice alright. I may end up doing that. But then I'd have to keep coming up with useful snippets or the repository would get lonely ;)

Jonathan Pryor said...

@Alan: Throw your useful snippets into Cadenza. It's a snippet repository. :-)

http://gitorious.org/cadenza

Chat with us on ##csharp at irc.freenode.org.
