Developing Business Connectivity Services (BCS) with Sharepoint 2010 and visual Studio

Hi all guys,

yes, it’s been quite a lot since i wrote something on this blog. I’ve been quite busy with lots of stuff, business and personal, so i didn’t took the right attention in posting news.

Anyway, here we are.

The topic I am covering today is something that is quite well treated on the internet, so i won’t go deep in details. I thought to provide some hints, and useful tips when you need to set up some BCS and you never did before.

First thing: set up your BCS

As i said, i don’t think it could be of much contribute to write down again all the stuff needed to create a BCS. To be complete, here are some excellent links to get the things right:

IMHO, the third is the most important of these links, because it is explained very well what you have to do when creating a BCS for Sharepoint.

So, you have followed all these links, and now you think you can rule sharepoint BCS  uh? Here are some hints that maybe useful when not everything goes as is supposed to go.

What the heck is the Entity, and how is it bound with my data?

From what i have understood until now (and i could be greatly wrong), what is called Entity is nothing more than a proxy used by sharepoint to retrieve your data. For coherence, the entity reflects your business data (also known as Line Of Business data, LOB), but not completely. This is what will be published under sharepoint, and will contain all the relevant information in order to specify your real data’s identifiers (read as: connection to real fields on your “table”), and methods (read as: what sharepoint will call once it needs to interact with your “table”). And, not to forget, this Proxyy, will deal with objects that ARE your LOB system. Therefore, the class you have to set under, let’s say, the Read Item method as return type, is the one you use to get the data (aka: if you’re using Linq2Sql is the actual class you create in the .dbml file).

The real binding, instead, is made in another part. You can get access through the BDC Explorer (that will soon became your best friend). There, you set the real thing.

In the BDC you’ll find again your newly created Entities, and in every single method you’ll create for that particular Entity you can define the data you’re dealing with. TIP: keep as a good practice to create always the specific finder first. Why? Because the specific finder lets you define the data your Entity will deal with, and once you create the other CRUD methods, VS will know how your data is structured.

Do I have to create an Identifier for each property I have in my LOB?

The answer is… depends :-). Vague answer i know, but it really depends on how you need to expose your data. Just keep in mind one thing: for each identifier you create, you have to handle that specific identifier in your search methods.

An example? You have a Customer Entity, with 7 properties. Let’s say you create 7 Identifiers. The signature of the ReadItem method will have to have 7 arguments, one per identifier. Otherwise, if you try to call the method SP will tell you that you can’t call a method that expects 7 identifiers. Therefore, if you need to make queries only by Id, create  only ONE identifier.

You have created your BCS, but the Line Of Business object (LOB) doesn’t have the exactly same properties of your BCS Entity

Strange as it might be, all the guys writing down these super tutorials always write this stuff as if you will never use them in real life. Therefore, they put down the perfect case ever, where everything is pure and perfect. Well, my case wasn’t. It was like probably every real life project, where everything is a bit messed up, and is not THAT linear.

So, how do u deal with LOB systems that differ from Entities? You just have to use a wonderful property called LOB Name. This provides the underlying mapping between your entity and the LOB. Assuming you are using an ORM like Linq2Sql, it has to be the property’s name of the class reflecting your sql data.

HELP! I keep updating the code, but on the bloody server I get always the same errors

This is quite easy, indeed… check whether the bloody visual studio didn’t defined a static versioning of your dll… like Be sure it has at least a * in the versioning number. In this way, sharepoint will understand that your assembly has changed at every build, and will replace the whole package.

Ok, how do I check if the methods i am creating are good ones or not, WITHOUT using sharepoint?

Another simple but annoying fact. Even if sharepoint is a great tool, is made for users. Not for developers. At least, it doesn’t give any real feedback in the page if something goes wrong.
So how can you deal with BCS? Fortunately, a nice company have developed a very useful tool: BCS Tester Man. Is extremely simple, ready to use, and throws out the real s#@t that SP doesn’t show when you make some mistake.

You made it. It works, at least while testing. Why the heck I can’t see my list?!?!?!?!

This one made me struggle for a good couple of hours. I really couldn’t understand what was wrong, even if i knew it was something very simple (as it always is). I made the BCS correctly, the mapping was right, the data provider (lob) was right too, with BCS Tester it worked correctly (list retrieval ok, specific ok). But after the deploying the BCS to sharepoint, and creating a list upon it, I had a very strange situation.
I could see the actual insert form, but in the list it showed me the generic error:
“Unable to display this Web Part. To troubleshoot the problem, open this Web page in a Microsoft SharePoint Foundation-compatible HTML editor such as Microsoft SharePoint Designer. If the problem persists, contact your Web server administrator”
And THIS is annoying. If you search in the internet for such an error, you can find almost everything. They all come down to a point, though: permissions. And it is bloody right also in this case. I have noticed basically two cases: the first one, where you didn’t set the right permission level on your BCS (check Central Admin -> Manage Service Application -> Business Connectivity Service -> Set Permissions)… or the one i dealt with: if you work with a third party database you need to grant the right permission level so that the user that will try to access to the database won’t be rejected.
In my case, the connection string used to connect to the database was using windows authentication. Therefore, i had to give read/write permission to the user that executes the application pool of my sharepoint website. After that… ta daaaaa! Match won! 😉


Well, this is all for now my dear readers. I just dealt with this kind of development, and i felt like useful if there would’ve been out there a post like this. with useful hints that are not very often posted on blogs.

As always, feel free to comment, reply, and all the usual stuff.





Extending Team Foundation Server – Part 1

Hi everybody, as promised some time ago, here  we are.

This is the first of a series of articles that will cover Microsoft Team Foundation Server. As obvious, this has not the intention to be a guide to the product. There are lots of books and references online, so if you want to dig into the topic, you can start at least from here. The intention of these posts is to share my experience, and at the same time provide some useful tip i have found in managing / extending TFS. The reason i chosen this title is because, eventually, the focus of the articles is how TFS can be extended in order to work with its APIs, and let you interact in the automated build process.

A short introduction

For the (few I think) ones who don’t know nothing at all about TFS, here is a brief explanation. First of all TFS is not the new version of Source Safe (as the guys from MS keep repeating), is something more (or I could say much more). With TFS you can (basically):

  • Track your projects;
  • Publish the management process via a SharePoint instance;
  • Analyze and review the management process;
  • Manage source code;
  • Create an automated build system;

    Main differences from VSS and TFS, on the source code management, are:

    Feature VSS TFS
    Source Code Repository Is based on a file system structure. Based on a Sql Server 2008 Database
    Architecture VSS is basically a network share, nothing more nothing less. TFS is a real client/server environment
    Backup/restore Have you ever tried that? is a pain in the butt…  unless you can make it work with an xcopy from a previous “backup” Being based on a Sql Server Database, you can add source code in your usual backup plan. Simple and straight-forward
    Transactions Because of its nature, no transactions are supported Native support (takes advantage of Sql Server transactions). Any code change is committed as a single transaction once checked in (the so called Changeset), therefore in case of errors, your changes are rolled back in a secure way
    Security User management is very simple, you can change everything or you can’t. Permissions are based on a concept quite similar to a file system, so you can tune your security policies in order to allow or deny access to each single project. Oh, and it is off course integrated in Active Directory (stop to several accounts for a single user).
    Scalability VSS is suitable for small teams (20 or less users recommended), and codebase size is limited to a maximum of 4Gb Can support teams up to 2000 users, and codebase size is limited to the size of a Sql Server database.
    Code management Vss is basically a “list of folders”, where you put your code. if you want to divide your codebase in areas, you have to create several “instances” of the database, so you can keep them separate. TFS is based on the concept of Team Project Collections. Each of them can reside on the same database server, and in different catalogs, but is treated in a separate shell. So you can isolate your codebase in a proper way.

    and this is just a glimpse. for a better understanding, you can refer here (a sort of VSS vs. TFS 2005), or here.

    Why did I get here

    I have recently installed the new version of Microsoft Team Foundation Server 2010 in my company. And with this new add i am planning to restructure the way source code is managed in our Development cycle.

    I work in a software house, so we have several projects running all together in the mainstream of application development. And like any software-house, we have also our internal set of libraries were we keep our “secret magic”.

    So, now that we are under restructuring, i have decided to keep things in a good order. The way it was managed before, was something i used to call: the easy but messy way. Substantially there were a couple of instances of MS Source Safe databases, and all the projects mixed together. Lots of users (some of them lost or abandoned in the waves of time), no integration, very inefficient this due to the limitations of Source Safe itself, and no concurrent development features. Also, code tracking is quite cumbersome, because there are no real tools to keep track of changes in an efficient and, most of all, EASY way.

    So I created a virtual machine with Windows Server 2008, and I have mounted on an instance of Team Foundation Server 2010. Because we have a pre-existing infrastructure, I have chosen not to make a stand-alone installation, but in case you have a brand new environment during install you can decide to install everything on a single host. It will take care of all the stuff, database (an Express version, sorry) SharePoint site, and off course the server stuff.

    I am not going through the installation process and options, as above you can find lots of posts talking about it, and after all is not that difficult to install.

    Once I had installed the whole thing, I have started to plan how to put the code inside Sorriso.

    I have decided to stay stick to a concept that could help keep things in order. Basically, we have two main areas of development: our internal framework / libraries, and the projects we deliver to the customers. Therefore, I have created two separate Team Project Collections: one named “Common libraries” where all the framework resides, and one named “Mainstream”, for the “real deal”.

    But after all the mess I have planned and started to do, I still had a couple of questions buzzing in my mind:

  • Fine, I have my source code that is separated, I can assign permissions to different teams, but how can I let the two libraries communicate one with the other?
  • And most important: how can I make it simple for my dear fellas at work?

    For the answer you will have to wait for the next posts, because this is right the topic I am going to cover Occhiolino.

    Stay tuned!

    Note for the readers: I know, probably some of you will think “oh he missed this feature!” or “it’s not really like that”. Please, keep in mind one thing: what I am writing here don’t want to be a Bible, and I am not going to dissect every single aspect of TFS. The intention is just to share my experience, and give some general advice. And in any case, if you want to comment you all are welcome!

  • upcoming posts

    hi guys! i know i’ve been away for a while from this blog, but i have been quite busy lately. just wanted to drop a note to inform my kind readers that in these next days i am going to post some new articles, on two major topics:
    – Extending Team Foundation Server programmatically;
    – Sharepoint 2010 Development / Customization;
    I’ve been struggling on these two topics quite a lot lately, and i have now gathered enough infos to write down something quite straightforward.
    So keep in touch and stay tuned!

    Automatically add TFS Builds to Source Control

    Hi all, i was looking in these days over the web some information to solve a problem i’m having from some time now.
    I have implemented in my company a TFS Server to manage my source code. Implementation is quite straightforward, a Team Project Collection for company’s libraries, and another one to manage customer’s projects.
    What i want to do is to create a mechanism that allows me to add to source control the binaries generated from the Library’s builds, directly into a folder in the “Customer Projects” collection instead of doing it manually everytime i run a build.
    This post seems really to do the job 🙂

    Sometimes PC Vendor are not THAT clever…

    Hi all, today I want to tell you about something very very weird that happened at work.

    A colleague just received a new laptop, and the thing had to be prepared for letting him do his work properly. This means, through the other things, to install Sql Server Management Studio so he can connect to databases and do all the stuff he need to be done.

    So (guess who won the annoyance?) that meant I had to prepare the laptop. Aside all the normal stuff like joining the workstation to the local domain, install tools and so on, I mounted our Sql Server 2008R2 Image in order to install the tools.

    What happened from there has been really really weird. The workstation is a HP Probook 4530S, with Windows 7 64 Bit Pro preinstalled. SSMS installation went quite straightforward, but at the end of the install process a message popped out saying something like “Unable to find that Type Library. Try reopen the program with an administrative account to fix the problem”. The message sounded quite awkward, for  2 reasons:

    1) I was using the administrative account to set up the laptop;

    2) fresh OS install… are you kidding me?

    Anyway, I banged my head over and over this bloody issue, no solution could be found. Tried with SubInAcl and a script in order to reset permission on the registry, uninstall and reinstall something like a million time the bloody software, applied SP1 and SP whatever, nothing to do. So I just headed to the final possible solution I had in mind in order to get this problem off… because it is a pre-installed pc I just wanted to use the system restore from HP.

    It’s somehow amazing how a bad day can become the worst day of the week in a matter of seconds…

    Let’s get back to the point. Straight system restart, keep pressed F11 to launch the system restore… bang! A sudden message appears:

    “No image can be found in the system. Insert a disk containing a valid media”.

    The message is not exactly the one I quoted, but that is the meaning. There was no way the system could find its OWN RESTORE PARTITION. That is designed RIGHT FOR THIS PURPOSE. To RESTORE.

    You can’t even start to figure out the appellation that came to my mind in that moment (I have a very strong imagination Sorriso), anyway I just opened the browser, and started to search for a solution. Lucky me (at least a good one in a crappy day), I found a post on the HP site you can find here where somebody else had the same problem.

    Quoting the guy that identified the issue:

    “If your disk type is set to dynamic,then that is where you have the problem of using the f11 to recover your computer.f11 will never work if you have your disk type set to dynamic.”

    And this is the amazing stupidity of this issue… looks like the guys from HP released a system image with dynamic disks, without noticing that in this case their own restore couldn’t work. My simple question is: do they have someone who tried it?

    Anyway, the solution is quite straight-forward: you just have to download a partition editor software, in order to change the disk partition from dynamic to basic. The software name is MiniTool Partition Wizard Professional v6, and you can find it at A freeware version is available, don’t even get close to that because it can’t manage dynamic partitions (and I am beginning to understand why Occhiolino). Once you obtained the software just install it, and the only thing you can do for first is to convert the partitions from dynamic to basic. This takes some minute, and it changes the whole disk, not just one partition (so don’t panic if it asks to convert everything and no partitions are displayed at first. They still are there is just the software cannot read it properly). Off course a couple of restart are required, after that you’ll have your disk set back to basic. A final restart pressing F11 and the magic happens. The partition is recognized and you can finally restore your laptop.

    I can’t tell you right now how it ended with SSMS, because I am restoring the bloody laptop while I am writing the post, but will keep you updated.

    INSERT Million Rows in a Sql Server table quickly

    Hi all, as I said in these past days, I am currently busy with a very big data import. Talking about Very Large Tables, with numbers in the range of 50 million rows in one table.

    In my last topic I discussed about deleting million of rows from a table, and when you deal with such amount of data it can just happen that by accident you push the wrong data in a table. If you don’t have a quick way to get rid of the mess, you just have to deal with VERY long deletes, and that honestly sucks… especially if you have a customer on your back poking at you every ten minutes asking if the porting is going well or not, and if he or she can see the numbers at the bottom line Sorriso.

    In this topic I want to discuss about another issue that comes with import… the INSERT process!

    First of all, a brief overview of how I designed the import. I have several CSV files that are sent to a frontier area, used by my application and third party applications that basically are my data providers. What I do with this file is:

    • import each file in the database, as a separate table;
    • run some preliminary validation on the imported data, just to be sure to not bring in some junk data;
    • consolidate all the tables in a staging area (read as: a single table), so that the customer can “play” with the totals, do some summarization and so on;
    • run some other formal validation, in order to check the business logics integrity among the dataset;
    • finalize the staging area, by importing them in what I called “production” environment.

    Reason of the third step is because I need to make some merge with existing data, create new key values, blabla, yadda yadda. And most important of all, all the business logic are based on the “production” area.

    So what I am basically doing here is, aside some validation, a big copy & paste by inserting rows from a table to another this on certain rules.

    And here it comes the tough part… how can you insert 40 million rows in a shot, in a very quick and efficient way? As I did in my previous post, I will expose two methods so that you can have a perspective on the problem.

    First Option – Insert straight away

    The first and simplest option is to do a massive insert from the source table to the destination table. It does what it does, but there are a couple of things to keep in mind when adopting this practice:

    • For sure is the safest way to achieve the result, if you have time;
    • For sure the database integrity is maintained in case something goes wrong. But, again, you need time if something goes wrong and the transaction has to rollback (and I mean LONG time, it could go on for DAYS);
    • You need to have some space on disk, I would suggest at least 150Gb of free space at minimum. Otherwise you will end up with filling up the space disk, and then the database will tell you something like “Oh boy, something went wrong because my transaction log is full… what about a nice rollback of EONS?”… so keep in mind the disk space factor when you deal with large data;

    Aside these downsides, this option is ALWAYS the BEST RECOMMENDEND ONE.

    Also, to trying reduce time on Inserts you can disable indexes on the destination table (it really helps). But then remember to rebuild them, or your database will be sluggish at best Sorriso.

    And, to reduce some space on the log file you can always alter the recovery mode from FULL to SIMPLE, and save something.

    But what if you don’t have time, you are on the deadline and you need to bring home the results?

    Second Option – Let’s bring ‘em all!

    I discovered this option in these past two days, when I was troubling with (what a coincidence) I was banging my head with deadlines, customer, failed import, and so on. I needed something that could let me put everything in (I had all my data validated, with the right key values, and so on) without bothering me about disk space and time.

    So after some googling around I tried to go down the path of BCP, by basically doing an export / import from the staging area to the production. This because I don’t know why but I have always thought that you can’t use OpenQuery to insert data from a table to another table… but off course you can Sorriso. You just have to use your own database as an external data source! This might sound quite simple and straight-forward for all the database gurus out there, but I have searched for quite a time and I couldn’t find anyone suggesting this.

    So here the snippet of code that probably someone of you is desperately trying to find if is in my same situation when I wrote this:

    INSERT INTO DestinationTable(destField1, destField2, destField3, importFlag)
    SELECT * FROM OPENROWSET( 'SQLNCLI', 'Server=NameOfMyServer;Trusted_Connection=yes;',
    'SELECT field1, field2, field3, 1 as importFlag
        FROM MyDatabse.MySchema.MyTable source WHERE  source.myCriteria = 1')

    This basically is the core of what you have to do. Let’s assume you have a source table with 3 fields, and you need to transfer all the data in it into a destination table. My approach is what follows:

    • on DestinationTable you add a BIT field, in order to keep track of the data you are importing (in case of something goes wrong you can simulate a rollback by deleting all rows with isImporting = 0)
    • the query you will use in the OpenRowset clause will have an additional value set to 1, and will fill the IsImporting field we talked about above.

    a note to keep in mind: OpenQuery needs a server name (guess what? you can use @@SERVERNAME if you want to make the script dynamic) and each table must be fully qualified, this means Catalog.Schema.TableName . I have used the trusted connection flag, so I am sure I have all the grants I need to run the query.

    The results I had from this approach are simply astonishing. If I have to insert, let’s say, 5 million rows with a standard insert it takes let’s say 30 minutes. With OpenQuery, we are in the range of 5. And no transaction log if you switch the recovery mode to BULK_LOGGED.

    I will say it again: this option has to be used just in case you are in an emergency, of you are 110% sure that nothing will happen if something goes wrong. And even in the latter case, if you have time go for the first. It is better and safer.

    But as I like to say, it’s always better to know how to break the rules to respect them better. And why you have to Sorriso.

    Final considerations: off course the scenario I presented here is not exactly the real thing, but it gives you a good picture. What I do in the real environment is also keep track of each file imported and so on, so in the real environment all these things I am explaining here are a bit more complicated. Just to let you know, I have tested the same thing also in Dynamic Sql, and it works great. So good luck and have fun!


    Have you ever tried to delete million rows from a T-Sql Table?

    Hi all, in this period I am troubling with Microsoft Sql Server. Especially with data import (ouch!). And i am not talking about some thousand or hundred of thousand rows, but of Very Large Tables, in a range between 10 to 50 million rows.

    When you have to do with such big sets of rows, you have to be very careful with the operations you are performing, otherwise it can be a real pain in the b…

    The topic i am focusing lately is on a deletion of most of a table. This because some import process have failed, so i have to “rollback” the import to a certain point.

    So, here’s the task:

    I have a table with 50 million rows, and have to delete 40 million of them. How can i do it?

    Well, substantially we have 3 options:

    First Option: Let’s Make it straight!

    First and for sure the first that comes to mind is to make a delete on the table as is, something like:

    DECLARE @myCriteria INT 
    SELECT @myCriteria  = 1
    DELETE FROM myTable WHERE myField = @myCriteria

    basically, what i mean here is a Delete straight on the table, in order to delete the whole chunk of data i don’t need. It does what it is supposed to do. The only problem here is:

    • you need a LOT of time;
    • you need a LOT of hard disk space.

    Lot of time because it will take a very long time to perform a delete on 50mln rows.

    Log of hard disk space because you don’t have to forget our best friend / enemy that is called Transaction Log. This tricky dude keeps track of almost single delete, in order to restore if something wrong happens. Useful, but really painful because it will take Gb of stuff you don’t really need (if everything goes ok off course). Maybe some of you can argue on this, or appoint that setting the recovery mode to Simple can help. True, but not THAT efficient. Risk of run out of space is always there.

    Second Option: Still straight, but not the whole chunk!

    The second thing that can come across mind is to make the same thing, but split the delete in several chunks. Let’s say chunks of 1 million rows at a time.

    To do this, you can script something like this:

    DECLARE @count INT ,
        @currentId INT ,
        @myCriteria INT 
    SET @myCriteria = 1  
    SELECT  @count = COUNT(*)
    FROM    myTable
    WHERE   myCriteria = @myCriteria
    WHILE ( @count > 0 ) 
            RAISERROR (N'--> Rows to delete: %d', --This is message 
                      10, -- Severity,
                      1, -- State,
                      @count -- First argument. N'number', Second argument 5
                      ) WITH NOWAIT ;
            SELECT  @currentId = MIN(myTableKeyField)
            FROM    myTable
            WHERE   myCriteria = @myCriteria
            RAISERROR (N'--> Starting from ID: %d', --This is message 
                      10, -- Severity,
                      1, -- State,
                      @currentId -- First argument. N'number', Second argument 5
                      ) WITH NOWAIT ;
            DELETE  FROM myTable
            WHERE   myTableKeyField < @currentId + 1000000
                    AND myCriteria = @myCriteria
            SELECT  @count = COUNT(*)
            FROM    myTable
            WHERE   myCriteria = @myCriteria
            PRINT @count

    As you can read, compared to the “First Option“ this script does nothing else but deleting the rows with ID within Min(ID) and Min(ID) + 1.000.000. Off course this script assumes you have in your table a field set to Identity, with unique numeric values. If you have such a field good, otherwise you can add it.

    Problem of this option are mostly the same:

    • Lot of time;
    • Unless you don’t call a truncate log each, let’s say, 10 million rows, Lot of space.

    Third option: what I finally sorted out after banging my head for a while (or as i called it: Copy & Truncate)

    The main issue when you have to work with such an amount of data, is that you have to change a bit the point of observation of the problem. If you keep watching such tables like “normal” ones, you will end up inevitably with the first two options.

    But because these kind of tables are so big, it’s mostly impossible to deal with such a deletes and pretend to make them fast… unless you don’t turn the problem upside down, and you start thinking about the rows you want to keep instead of the rows you want to remove.

    So third option, and IMHO the most effective in terms of time and space is:

    First you create a table that is exactly the same copy of the “source” table, except primary keys with identity. Then you make an insert from the source, in order to preserve in a dedicated area all the rows you do NOT want to delete. Once you are sure (triple check your data please), you just run a Truncate on the table.

    Truncate, instead of Delete, simply “flush” all the rows in a table marking them as obsolete, ready to be recycled (it’s not important in this contest the deep difference between truncate and delete… we just need to know is FASTER. Damn FASTER than delete).

    Once you have cleaned up the mess, you are ready to perform a second insert from the backup into the source again. If you have identities, all you have to do is to set identity_insert to on before the insert, and the game is done.

    Final considerations

    It’s quite clear that these options are in some way a workaround to repair some accidental mess you have made on data, or something that went very wrong during some data import. But is useful to know, and most of all is useful to remind that a tool is as powerful as much as your ability to find new paths to get to the goal.

    Just for some reference, i have tried all the three methods (guess why? Sorriso) on a VM Single core 2.4Ghz 6Gb Ram. Here are the timings of the three options:

    Straight –> after more than 8 hours i was still waiting, and it filled my 100Gb HD for log(!);

    Splitted –> it made it, after 8 hours. Disk was mostly free.

    Copy & Truncate –> 10 minutes to copy the rows, 5 minutes (maximum) to truncate, 10 minutes to get data back!

    As last note before closing this posts, one thing: BEFORE ATTEMPTING ANYTHING LIKE THIS, REMEMBER TO MAKE A FULL BACKUP!

    Goodbye all, and thanks in advance for your comments,

    eabidi Sorriso