Separate Binary Data Out Of An Entity For Effective Loading

I received an email this morning asking about loading binary data related to an entity, here is the excerpt:


I was reading the following post on your blog and checking the code in Google Code. Entity Framework 4 POCO, Repository and Specification Pattern [Upgraded to EF 4.1]

I found it really interesting and realized that the performance issues where solved.

But how do you handle binary data?

Consider you have an entity with 2 Byte[] fields (varbinary or filestream in the database).

public class Product
    public Int32 Id { get; set; }

    public Boolean Available { get; set; }

    public String Name { get; set; }

    public Byte[] Image { get; set; }

    public Byte[] Brochure { get; set; }

When N records are loaded you will get N * 2 files being loaded even if you want to display only the names.

One solution to this problem is to apply a projection. For example:

context.Products.Where(x => x.Available).Select(x => new ProductView { Available = x.Available, Name = x.Name, Image = x.Image });

In this case I am leaving out the brochure but including the image … How do you address this with your code?”


Here is my answer:

In my experience regarding the binary data, I would create another entity, says, BinaryData which is somewhat like below (of course there will be a corresponding BinaryData table in the database):
public class BinaryData
    public Int32 Id { get; set; }
    public Byte[] Data { get; set; }
Then within an entity that has binary data, like your Product with Image or Brochure properties, I just add dependencies (or relationship) to the BinaryData entity:
public class Product
    // ... other properties
    public BinaryData Image { get; set; }
    public BinaryData Brochure { get; set; }

With lazy-loading enabled, you won’t worry about loading binary data (Image, Brochure) when loading Product, but it will be loaded when you access to it like: product.Image”

I myself think it is a good design practice when working with ORM framework, like Entity Framework or NHibernate.

What do you think? If there is a better solution for this, please drop a comment.

Posted on July 8, 2011, in Entity Framework, ORM, Tip & Trick and tagged . Bookmark the permalink. 3 Comments.

  1. Excellent tips, a.Huy!

  2. Here is another option using 3 tables (Products, ProductsFiles, Files).

    create table dbo.Products
    Id int identity not null
    constraint Products_Id_PK primary key clustered (Id),
    CategoryId int not null,
    Approved bit not null
    constraint Products_Approved_DF default (0),
    Name nvarchar(400) null

    Note 1:
    The filestream fields were removed from Products.

    Note 2:
    I added a FK CategoryId related to a table Categories.
    The objective is to provide an example with more tables.

    Now the changes:

    create table dbo.Files
    Id int identity not null
    constraint Files_Id_PK primary key clustered (Id),
    Data varbinary (max) filestream null
    constraint Files_Data_DF default (0x),
    [Key] uniqueidentifier rowguidcol not null
    constraint Files_Key_U unique,
    Flag nvarchar(40) null
    ) filestream_on [STORAGE]

    The table files would hold all files from all tables of the database.

    The flag column, in a file associated to a Product, can be “Image” or “Brochure”.

    The two tables are related as follows:

    create table dbo.ProductsFiles
    FileId int not null,
    ProductId int not null,
    constraint ProductsFiles_ProductId_FileId_PK primary key clustered (ProductId, FileId)

    For other tables with filestream, Posts, Books, etc you would have PostsFiles, BooksFiles, etc

    Now the EF part. Consider loading a list of products with:

    1 – LazyLoading = OFF:
    You would get only the Product table. No binary data.
    But you want get the Category either.

    2 – LazyLoading = OFF and Include=”Category”.
    You would get the Category and not the Files

    3 – LazyLoading = ON;
    You would get the Category and Files

    4 – LazyLoading = ON and with a Custom Parameter: Exclude (Brochure)
    On your repository you would load Category and Files.
    But would do a projection in the database to not load the file Brochure.

    In the future this might be replaced by EF lazy loading on column.

    You can also check if there is a brochure for Product X but not load it.
    Just check if there is a association to a file with Flag brochure.

    This approach can be associated to other scenarios.

    Imagine that on a web site you need to make an isolated image available for change.

    You could add two more tables associated to Files:

    create table dbo.FilesProperties
    FileId int not null,
    PropertyId int not null,
    constraint FilesProperties_FileId_PropertyId_PK primary key clustered (FileId, PropertyId),
    Value nvarchar(20) null

    create table dbo.Properties
    Id int identity not null
    constraint Products_Id_PK primary key clustered (Id),
    Type nvarchar(400) null

    Properties could be something like:
    MimeType, Width, Height, MaximumSize, etc

    Now you could associate properties to files. Something like:

    MIME = “image/jpeg” or MaximumSize=”40KB”.

    So this file system can not only be used to save files from other tables but also isolated content which always exists but needs restrictions.

    This is just a small example of how to extend the system …

    And if EF in the future adds lazy loading for columns this approach is still fine.

    There would be only a few changes on the C# code.

    Any comment?

  3. Good work around shaper. If you prove your code it would enlight more

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: