How do I get .Net managed code to retrieve a string from a native DLL?

If you have to do any interop with native code, sooner or later you'll come across a function that expects a C-style string buffer it can fill with some data. Many Windows API functions follow this pattern themselves. So let's say we have an old DLL called GenUtils.dll, which exposes a function like this:

BOOL __declspec(dllexport) __stdcall GenUtils_GetSomeString (int   nStringReference,
                                                             char *szValue,
                                                             int   nBuffLen);
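
To make the contract concrete, here is a minimal sketch of what the native side of such a function might look like. Everything beyond the prototype is an assumption made for illustration: the string table is invented, the GENUTILS_API and GENUTILS_CALL macros are hypothetical helpers so the code also builds outside Windows, nBuffLen is taken to include the terminating NUL, and the BOOL result is taken to mean success or failure.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical shims: the export and calling-convention keywords only
   matter to Windows compilers, so they expand to nothing elsewhere. */
#if defined(_WIN32)
#  define GENUTILS_API  __declspec(dllexport)
#  define GENUTILS_CALL __stdcall
#else
#  define GENUTILS_API
#  define GENUTILS_CALL
#endif

typedef int BOOL;  /* same 4-byte layout as the Windows BOOL */

/* Invented string table, standing in for whatever the real DLL looks up. */
static const char *const k_strings[] = { "alpha", "beta", "gamma" };

/* Copies string number nStringReference into szValue.
   Assumption: nBuffLen is the buffer size in chars, terminator included.
   Returns 1 on success, 0 if the reference is unknown or the buffer is
   missing or too small; on failure the buffer is left as an empty string. */
GENUTILS_API BOOL GENUTILS_CALL GenUtils_GetSomeString(int   nStringReference,
                                                       char *szValue,
                                                       int   nBuffLen)
{
    const int count = (int)(sizeof k_strings / sizeof k_strings[0]);
    size_t len;

    if (szValue == NULL || nBuffLen <= 0)
        return 0;

    szValue[0] = '\0';  /* never hand back an unterminated buffer */

    if (nStringReference < 0 || nStringReference >= count)
        return 0;

    len = strlen(k_strings[nStringReference]);
    if (len + 1 > (size_t)nBuffLen)
        return 0;       /* string plus terminator would not fit */

    memcpy(szValue, k_strings[nStringReference], len + 1);
    return 1;
}
```

The point to take away is that the callee only ever writes into memory the caller owns, and relies on nBuffLen to know where to stop. That is exactly the arrangement the managed side has to reproduce.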

It takes an identifier for a string and puts the string in the buffer supplied, which is nBuffLen characters long. How do we get a .Net assembly to interface with this kind of function? There are two parts to the solution. The first is to get the DllImport attributes correctly defined when we write the wrapper assembly. The second is to use a StringBuilder object. The StringBuilder is necessary because strings are immutable in the .Net world, so a plain string cannot serve as a buffer for native code to write into. So first let's create a .Net wrapper library we can put around our old DLL. To illustrate some points, I'm going to pretend the DLL uses a calling convention other than C's default, and is ANSI rather than Unicode.

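using System.Runtime.InteropServices;   // DllImport, MarshalAs, CharSet, ...
using System.Text;                      // StringBuilder

namespace MyOldUtils 
{
   public class oldstuff
   {
      // NOTE: these are static members, you don't need to instantiate an 
      //       object to use them, just qualify the reference with
      //       the class name, e.g. oldstuff.GetSomeString (...)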
         
      [DllImport ("GenUtils.dll", 
                  EntryPoint = "GenUtils_GetSomeString", 
                  ExactSpelling = false, 
                  CharSet = CharSet.Ansi, 
                  CallingConvention = CallingConvention.StdCall)] 
      public static extern bool GetSomeString (int nStringRef,
                                               [Out][MarshalAs (UnmanagedType.LPStr)] StringBuilder sbData,
                                               int nBuffLen); 
      ... other code ... 
   }
}

Notice that since the old DLL is ANSI rather than Unicode, we need to specify the charset correctly, and set the unmanaged type to match (the CharSet = CharSet.Ansi setting and the UnmanagedType.LPStr in the MarshalAs attribute). This means .Net does some of the heavy lifting for us, converting between its own Unicode characters and the single-byte characters the DLL expects. Note also that this DLL was defined to use the StdCall calling convention (say, because it was designed to be callable from Delphi code) rather than the default C convention of cdecl, so the CallingConvention attribute specifies StdCall. Strictly speaking, DllImport defaults to CallingConvention.Winapi, which resolves to StdCall on Windows, so the attribute here only makes the choice explicit; it is for a cdecl DLL that you would have to override the default with CallingConvention.Cdecl. The same StdCall calling convention would apply if you were doing a PInvoke for a Windows library, because StdCall is the default convention for the Windows API itself, even though large parts of Windows are written in C. The primary reason C defaults to cdecl is to support variadic functions: the caller cleans up the stack, and only the caller knows how many arguments it pushed. The MarshalAs gubbins there is the thing that gets the job done as far as our string is concerned.

So now let's see how some code in the final .Net program can access the wrapper. Let's say we have a WinForms testbed, where a button on the form invokes our retrieval code. The handler gets the string index to use from a numeric up-down (spinner) control called udStringNum and puts the result in a listbox called lbOutput:

using System.Text;   // StringBuilder
using MyOldUtils;
...
private void btnGetString_Click (object sender, EventArgs e)
{
   string sOutput;
   StringBuilder sbResult = new StringBuilder (200);
   
   // NumericUpDown.Value is a decimal, so it must be cast to int explicitly
   oldstuff.GetSomeString ((int) udStringNum.Value, sbResult, sbResult.Capacity);
   sOutput = "legacy dll string " + 
             udStringNum.Value.ToString() + 
             " has the value: '" + 
             sbResult.ToString() + "'";
   lbOutput.Items.Add (sOutput);
}

That's all there is to it.