Assemblies, Metadata and Manifests

 

From http://msdn2.microsoft.com/en-us/library/hk5f40ct(VS.80).aspx.

Assemblies are the building blocks of .NET Framework applications; they form the fundamental unit of deployment, version control, reuse, activation scoping and security permissions. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. An assembly provides the CLR with the information it needs to be aware of type implementations. To the runtime, a type does not exist outside of an assembly.

 

Assemblies Overview

Assemblies are a fundamental part of programming with the .NET Framework. An assembly performs the following functions:

 

Assembly Benefits

Assemblies are designed to simplify application deployment and to solve versioning problems that can occur with component-based applications.

End users and developers are familiar with versioning and deployment issues that arise from today's component-based systems. Some end users have experienced the frustration of installing a new application on their computer, only to find that an existing application has suddenly stopped working. Many developers have spent countless hours trying to keep all necessary registry entries consistent in order to activate a COM class.

Many deployment problems have been solved by the use of assemblies in the .NET Framework. Because they are self-describing components that have no dependencies on registry entries, assemblies enable zero-impact application installation. They also simplify uninstalling and replicating applications.

 

Versioning Problems

Currently two versioning problems occur with Win32 applications:

These two versioning problems combine to create DLL conflicts, where installing one application can inadvertently break an existing application because a certain software component or DLL was installed that was not fully backward compatible with a previous version. Once this situation occurs, there is no support in the system for diagnosing and fixing the problem.

An End to DLL Conflicts

Microsoft Windows 2000 began to address these problems. It provides two features that partially fix DLL conflicts:

The common language runtime uses assemblies to continue this evolution toward a complete solution to DLL conflicts.

The Assembly Solution

To solve versioning problems, as well as the remaining problems that lead to DLL conflicts, the runtime uses assemblies to do the following:

 

Assembly Contents

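In general, a static assembly can consist of four elements: the assembly manifest (which contains the assembly metadata), type metadata, Microsoft intermediate language (MSIL) code that implements the types, and a set of resources.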

Only the assembly manifest is required, but either types or resources are needed to give the assembly any meaningful functionality.

There are several ways to group these elements in an assembly. You can group all elements in a single physical file, which is shown in the following illustration.

Single-file assembly

Alternatively, the elements of an assembly can be contained in several files. These files can be modules of compiled code (.netmodule), resources (such as .bmp or .jpg files), or other files required by the application. Create a multifile assembly when you want to combine modules written in different languages, or when you want to optimize downloading an application by putting seldom-used types in a module that is downloaded only when needed.

In the following illustration, the developer of a hypothetical application has chosen to separate some utility code into a different module and to keep a large resource file (in this case a .bmp image) in its original file. The .NET Framework downloads a file only when it is referenced; keeping infrequently referenced code in a separate file from the application optimizes code download.

Multifile assembly

Note The files that make up a multifile assembly are not physically linked by the file system. Rather, they are linked through the assembly manifest and the common language runtime manages them as a unit.

In this illustration, all three files belong to one assembly, as described in the assembly manifest contained in myAssembly.dll. To the file system, they are three separate files. Note that the file Util.netmodule was compiled as a module because it contains no assembly information. When the assembly was created, the assembly manifest was added to myAssembly.dll, indicating its relationship with Util.netmodule and Graphic.bmp.

When designing your source code today, you make explicit decisions about how to partition the functionality of your application into one or more files. When designing .NET Framework code, you will make similar decisions about how to partition the functionality into one or more assemblies.

 

Manifests

Every assembly, whether static or dynamic, contains a collection of data that describes how the elements in the assembly relate to each other. The assembly manifest contains this assembly metadata. An assembly manifest contains all the metadata needed to specify the assembly's version requirements and security identity, and all metadata needed to define the scope of the assembly and resolve references to resources and classes. The assembly manifest can be stored in either a portable executable (PE) file (an .exe or .dll) with Microsoft intermediate language (MSIL) code or in a standalone PE file that contains only assembly manifest information.

For an assembly with one associated file, the manifest is incorporated into the PE file to form a single-file assembly. You can create a multifile assembly with a standalone manifest file or with the manifest incorporated into one of the PE files in the assembly.

Each assembly's manifest performs the following functions:

The following table shows the information contained in the assembly manifest. The first four items — the assembly name, version number, culture, and strong name information — make up the assembly's identity.

Information | Description
Assembly name | A text string specifying the assembly's name.
Version number | A major and minor version number, and a revision and build number. The common language runtime uses these numbers to enforce version policy.
Culture | Information on the culture or language the assembly supports. This information should be used only to designate an assembly as a satellite assembly containing culture- or language-specific information. (An assembly with culture information is automatically assumed to be a satellite assembly.)
Strong name information | The public key from the publisher if the assembly has been given a strong name.
List of all files in the assembly | A hash of each file contained in the assembly and a file name. Note that all files that make up the assembly must be in the same directory as the file containing the assembly manifest.
Type reference information | Information used by the runtime to map a type reference to the file that contains its declaration and implementation. This is used for types that are exported from the assembly.
Information on referenced assemblies | A list of other assemblies that are statically referenced by the assembly. Each reference includes the dependent assembly's name, assembly metadata (version, culture, operating system, and so on), and public key, if the assembly is strong named.
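
To make the table more concrete, the manifest can be pictured as a small record type. The following Python sketch is only an illustration of the listed fields, not the actual on-disk format; the class names, the "neutral" culture default, the type name Util.Helper and the placeholder hashes are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AssemblyIdentity:
    """The four identity items: name, version, culture, strong name key."""
    name: str
    version: Tuple[int, int, int, int]   # major, minor, build, revision
    culture: str = "neutral"             # assumption: "neutral" means no culture
    public_key: Optional[bytes] = None   # set only for strong-named assemblies


@dataclass
class Manifest:
    """A simplified, hypothetical model of the manifest table."""
    identity: AssemblyIdentity
    files: Dict[str, str] = field(default_factory=dict)           # file name -> hash
    exported_types: Dict[str, str] = field(default_factory=dict)  # type name -> file name
    references: List[AssemblyIdentity] = field(default_factory=list)

    def file_for_type(self, type_name: str) -> str:
        """Map a type reference to the file that contains the type."""
        return self.exported_types[type_name]


# Example values are placeholders, loosely based on the multifile illustration.
manifest = Manifest(
    identity=AssemblyIdentity("myAssembly", (1, 5, 1254, 0)),
    files={"Util.netmodule": "<hash>", "Graphic.bmp": "<hash>"},
    exported_types={"Util.Helper": "Util.netmodule"},
)
```

Making the identity immutable and hashable reflects the idea that identity is what distinguishes one assembly from another; everything else hangs off it.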
 

Global Assembly Cache

Each computer where the common language runtime is installed has a machine-wide code cache called the global assembly cache. The global assembly cache stores assemblies specifically designated to be shared by several applications on the computer.

You should share assemblies by installing them into the global assembly cache only when you need to. As a general guideline, keep assembly dependencies private and locate assemblies in the application directory unless sharing an assembly is explicitly required. In addition, you do not have to install assemblies into the global assembly cache to make them accessible to COM interop or unmanaged code.

Note There are scenarios where you explicitly do not want to install an assembly into the global assembly cache. If you place one of the assemblies that make up an application in the global assembly cache, you can no longer replicate or install the application by XCOPYing the application directory. You must move the assembly in the global assembly cache as well.


There are several ways to deploy an assembly into the global assembly cache:

Administrators often protect the WINNT directory using an Access Control List (ACL) to control write and execute access. Because the global assembly cache is installed in the WINNT directory, it inherits that directory's ACL. It is recommended that only users with Administrator privileges be allowed to delete files from the global assembly cache.

Assemblies deployed in the global assembly cache must have a strong name. When an assembly is added to the global assembly cache, integrity checks are performed on all files that make up the assembly. The cache performs these integrity checks to ensure that an assembly has not been tampered with (for example, when a file has changed but the manifest does not reflect the change).

 

Strong-named Assemblies

A strong name consists of the assembly's identity — its simple text name, version number, and culture information (if provided) — plus a public key and a digital signature. It is generated from an assembly file (the file that contains the assembly manifest, which in turn contains the names and hashes of all the files that make up the assembly), using the corresponding private key. Microsoft® Visual Studio .NET® and other development tools provided in the .NET Framework SDK can assign strong names to an assembly. Assemblies with the same strong name are expected to be identical.

You can ensure that a name is globally unique by signing an assembly with a strong name. In particular, strong names satisfy the following requirements:

When you reference a strong-named assembly, you expect to get certain benefits, such as versioning and naming protection. If the strong-named assembly then references an assembly with a simple name, which does not have these benefits, you lose the benefits you would derive from using a strong-named assembly and revert to DLL conflicts. Therefore, strong-named assemblies can only reference other strong-named assemblies.

 

Assembly Security

When you build an assembly, you can specify a set of permissions that the assembly requires to run. Optional permissions can be granted by the security policy set on the computer where the assembly will run. If you want your code to handle all potential security exceptions, you can do one of the following:

Note Security is a complex area and you have many options to choose from. For more information, see Key Security Concepts.


At load time, the assembly's evidence is used as input to security policy. Security policy is established by the enterprise and the computer's administrator as well as by user policy settings, and determines the set of permissions that is granted to all managed code when executed. Security policy can be established for the publisher of the assembly (if it has a signcode signature), for the Web site and zone (in Internet Explorer terms) the assembly was downloaded from, or for the assembly's strong name. For example, a computer's administrator can establish security policy that allows all code downloaded from a Web site and signed by a given software company to access a database on a computer, but does not grant access to write to the computer's disk.
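
The example in the paragraph above can be sketched as a lookup from evidence to permissions. This Python fragment is a deliberately simplified model, not the runtime's policy engine: the rule format, the evidence keys, the permission names and the choice to take the union of all matching rules are assumptions made for illustration.

```python
def grant_permissions(evidence, policy):
    """Return the permissions granted to code that presents this evidence.

    policy is a list of (condition, permissions) pairs. A rule matches when
    every key in its condition equals the corresponding evidence value.
    """
    granted = set()
    for condition, permissions in policy:
        if all(evidence.get(key) == value for key, value in condition.items()):
            granted |= permissions
    return granted


# Hypothetical policy: code from one site, signed by one publisher,
# may access a database. Nothing grants disk writes.
policy = [
    ({"site": "www.example.com", "publisher": "Example Corp"}, {"database_access"}),
]

trusted = grant_permissions(
    {"site": "www.example.com", "publisher": "Example Corp", "zone": "Internet"},
    policy,
)
unknown = grant_permissions({"site": "other.example.org", "zone": "Internet"}, policy)
```

Here `trusted` contains only `database_access`, and `unknown` is empty, which mirrors the administrator scenario described above.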

Strong-Named Assemblies and Signcode

You can sign an assembly in two different but complementary ways: with a strong name or using Signcode.exe. Signing an assembly with a strong name adds a public key and a digital signature to the file containing the assembly manifest. Strong name signing ensures name uniqueness, prevents name spoofing, and provides callers with some identity when a reference is resolved.

However, no level of trust is associated with a strong name, which makes signcode important. Signcode requires a publisher to prove their identity to a third-party authority and obtain a certificate. This certificate is then embedded in your file and can be used by an administrator to decide whether to trust the code's authenticity.

You can give both a strong name and a signcode digital signature to an assembly, or you can use either alone. Signcode can sign only one file at a time; for a multifile assembly, you sign the file that contains the assembly manifest. A strong name is stored in the file containing the assembly manifest, but a signcode signature is stored in a reserved slot in the portable executable (PE) file containing the assembly manifest. Signcode signing of an assembly can be used (with or without a strong name) when you already have a trust hierarchy that relies on signcode signatures, or when your policy uses only the key portion and does not check a chain of trust.

Note When using both a strong name and a signcode signature on an assembly, the strong name must be assigned first.

The common language runtime also performs a hash verification; the assembly manifest contains a list of all files that make up the assembly, including a hash of each file as it existed when the manifest was built. As each file is loaded, its contents are hashed and compared with the hash value stored in the manifest. If the two hashes do not match, the assembly fails to load.
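
The hash check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the runtime's implementation: the manifest is modeled as a plain dictionary, SHA-256 is chosen arbitrarily (the text does not name the algorithm), and the file names and contents are placeholders.

```python
import hashlib
import os
import tempfile


def hash_file(path):
    """Return a hex digest of the file's contents (SHA-256 is an assumption)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def build_manifest(directory, file_names):
    """Record the hash of each file as it exists when the manifest is built."""
    return {name: hash_file(os.path.join(directory, name)) for name in file_names}


def verify_file(directory, name, manifest):
    """Re-hash the file at load time and compare with the recorded hash."""
    return hash_file(os.path.join(directory, name)) == manifest[name]


def demo():
    """Build a manifest, verify it, tamper with one file, verify again."""
    with tempfile.TemporaryDirectory() as d:
        for name, data in [("Util.netmodule", b"code"), ("Graphic.bmp", b"pixels")]:
            with open(os.path.join(d, name), "wb") as f:
                f.write(data)
        manifest = build_manifest(d, ["Util.netmodule", "Graphic.bmp"])
        ok_before = all(verify_file(d, n, manifest) for n in manifest)
        # Change a file without updating the manifest.
        with open(os.path.join(d, "Graphic.bmp"), "wb") as f:
            f.write(b"tampered")
        ok_after = all(verify_file(d, n, manifest) for n in manifest)
        return ok_before, ok_after
```

Calling `demo()` returns a pair of booleans: the check passes on the untouched files and fails once a file has changed but the manifest has not.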

Because strong naming and signcode signing guarantee integrity through digital signatures and certificates, you can base code access security policy on these two forms of assembly evidence. All the technologies mentioned (hash verification, strong naming, and signcode signing) work together to ensure that the assembly has not been altered in any way.

 

Assembly Versioning

All versioning of assemblies that use the common language runtime is done at the assembly level. The specific version of an assembly and the versions of dependent assemblies are recorded in the assembly's manifest. The default version policy for the runtime is that applications run only with the versions they were built and tested with, unless overridden by explicit version policy in configuration files (the application configuration file, the publisher policy file, and the computer's administrator configuration file).

Note Versioning is done only on assemblies with strong names.
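
The default policy and its overrides can be sketched as a chain of redirects. In this Python illustration each configuration file is reduced to a dictionary that maps a (name, version) pair to a replacement version; that representation, the function name, and the assumption that the files are consulted in the order the text lists them (application, publisher policy, machine) are simplifications made here.

```python
def apply_version_policy(name, requested, app_config=None,
                         publisher_policy=None, machine_config=None):
    """Return the version to bind to after applying any redirects.

    With no configuration, the requested version (the one the application
    was built with) is returned unchanged.
    """
    version = requested
    for policy in (app_config, publisher_policy, machine_config):
        if policy:
            # Each stage sees the version produced by the previous stage.
            version = policy.get((name, version), version)
    return version


# No policy: the application binds to the version it was built against.
default = apply_version_policy("Util", "1.0.0.0")

# The application configuration redirects 1.0.0.0 to 1.1.0.0.
redirected = apply_version_policy(
    "Util", "1.0.0.0", app_config={("Util", "1.0.0.0"): "1.1.0.0"}
)
```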

The runtime performs several steps to resolve an assembly binding request:

The following illustration shows these steps.

Resolving an assembly binding request

Assembly Version Number

Each assembly has a version number as part of its identity. As such, two assemblies that differ by version number are considered by the runtime to be completely different assemblies. This version number is physically represented as a four-part number with the following format:

<major version>.<minor version>.<build number>.<revision>

For example, version 1.5.1254.0 indicates 1 as the major version, 5 as the minor version, 1254 as the build number, and 0 as the revision number.
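
As a small illustration of the four-part format, the following Python sketch splits a version string into its named parts. The type and function names are hypothetical, and the parser accepts only the full four-part form shown above.

```python
from typing import NamedTuple


class AssemblyVersion(NamedTuple):
    major: int
    minor: int
    build: int
    revision: int


def parse_version(text):
    """Parse '<major>.<minor>.<build>.<revision>' into an AssemblyVersion."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated parts: %r" % text)
    return AssemblyVersion(*(int(part) for part in parts))


v = parse_version("1.5.1254.0")
# v.major is 1, v.minor is 5, v.build is 1254, v.revision is 0
```

Because the version is part of the identity, two versions that differ in any of the four parts compare as different values.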

The version number is stored in the assembly manifest along with other identity information, including the assembly name and public key, as well as information on relationships and identities of other assemblies connected with the application.

When an assembly is built, the development tool records dependency information for each assembly that is referenced in the assembly manifest. The runtime uses these version numbers, in conjunction with configuration information set by an administrator, an application, or a publisher, to load the proper version of a referenced assembly.

The runtime distinguishes between regular and strong-named assemblies for the purposes of versioning. Version checking only occurs with strong-named assemblies.

For information about specifying version binding policies, see Configuration Files. For information about how the runtime uses version information to find a particular assembly, see How the Runtime Locates Assemblies.

 

How the Runtime Locates Assemblies

To successfully deploy your .NET Framework application, you must understand how the common language runtime locates and binds to the assemblies that make up your application. By default, the runtime attempts to bind with the exact version of an assembly that the application was built with. This default behavior can be overridden by configuration file settings.

The common language runtime performs a number of steps when attempting to locate an assembly and resolve an assembly reference. Each step is explained in the following sections. The term probing is often used when describing how the runtime locates assemblies; it refers to the set of heuristics used to locate the assembly based on its name and culture.

Note You can view binding information in the log file using the Assembly Binding Log Viewer (Fuslogvw.exe), which is included in the .NET Framework SDK.

Initiating the Bind

The process of locating and binding to an assembly begins when the runtime attempts to resolve a reference to another assembly. This reference can be either static or dynamic. The compiler records static references in the assembly manifest's metadata at build time. Dynamic references are constructed on the fly as a result of calling various methods, such as System.Reflection.Assembly.Load.

The preferred way to reference an assembly is to use a full reference, including the assembly name, version, culture, and public key token (if one exists). The runtime uses this information to locate the assembly, following the steps described later in this section. The runtime uses the same resolution process regardless of whether the reference is for a static or dynamic assembly.

You can also make a dynamic reference to an assembly by providing the calling method with only partial information about the assembly, such as specifying only the assembly name. In this case, only the application directory is searched for the assembly, and no other checking occurs. You make a partial reference using any of the various methods for loading assemblies such as System.Reflection.Assembly.Load or AppDomain.Load. If you want the runtime to check the global assembly cache as well as the application directory for a referenced assembly, you can specify a partial reference using the System.Reflection.Assembly.LoadWithPartialName method. For more information about partial binding, see Partial Assembly References.

Finally, you can make a dynamic reference using a method such as System.Reflection.Assembly.Load and provide only partial information; you then qualify the reference using the <qualifyAssembly> element in the application configuration file. This element allows you to provide the full reference information (name, version, culture and, if applicable, the public key token) in your application configuration file instead of in your code. You would use this technique if you wanted to fully qualify a reference to an assembly outside the application directory, or if you wanted to reference an assembly in the global assembly cache but you wanted the convenience of specifying the full reference in the configuration file instead of in your code.

Note This type of partial reference should not be used with assemblies that are shared among several applications. Because configuration settings are applied per application and not per assembly, a shared assembly using this type of partial reference would require each application using the shared assembly to have the qualifying information in its configuration file.
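
To show what qualifying a partial reference might look like, the Python sketch below embeds a small application configuration fragment and looks up the full name for a partial one. The assembly name `math`, the version and the public key token are made-up placeholder values, and the lookup function is an illustration of the idea rather than how the runtime reads its configuration.

```python
import xml.etree.ElementTree as ET

ASM_NS = "urn:schemas-microsoft-com:asm.v1"

# Hypothetical application configuration; the fullName values are placeholders.
APP_CONFIG = """\
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <qualifyAssembly partialName="math"
                       fullName="math, Version=1.5.1254.0, Culture=neutral, PublicKeyToken=0123456789abcdef"/>
    </assemblyBinding>
  </runtime>
</configuration>
"""


def qualify(partial_name, config_xml):
    """Return the full reference for partial_name, or the name unchanged."""
    root = ET.fromstring(config_xml)
    # qualifyAssembly inherits the default namespace of assemblyBinding.
    for element in root.iter("{%s}qualifyAssembly" % ASM_NS):
        if element.get("partialName") == partial_name:
            return element.get("fullName")
    return partial_name


full_name = qualify("math", APP_CONFIG)
```

The code that loads the assembly keeps using the short name, while the version, culture and key token live in the configuration file.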

The runtime uses the following steps to resolve an assembly reference:

  1. Determines the correct assembly version by examining applicable configuration files, including the application configuration file, publisher policy file, and machine configuration file. If the configuration file is located on a remote machine, the runtime must locate and download the application configuration file first.

  2. Checks whether the assembly name has been bound to before and, if so, uses the previously loaded assembly.

  3. Checks the global assembly cache. If the assembly is found there, the runtime uses this assembly.

  4. Probes for the assembly using the following steps:

    1. If configuration and publisher policy do not affect the original reference and if the bind request was created using the Assembly.LoadFrom method, the runtime checks for location hints.

    2. If a codebase is found in the configuration files, the runtime checks only this location. If this probe fails, the runtime determines that the binding request failed and no other probing occurs.

    3. Probes for the assembly using the heuristics described in the probing section. If the assembly is not found after probing, the runtime requests the Windows Installer to provide the assembly. This acts as an install-on-demand feature.

    Note There is no version checking for assemblies without strong names, nor does the runtime check in the global assembly cache for assemblies without strong names.
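
The lookup order in steps 2 through 4 can be sketched as a single function. This Python model is a simplification under several assumptions: the version passed in is the one already chosen by step 1, the caches are plain dictionaries, the file system is a set of path strings, and the location hints for Assembly.LoadFrom and the Windows Installer fallback are left out. All names and paths are hypothetical.

```python
def resolve(name, version, strong_named, loaded, gac, codebases, probe_paths, files):
    """Return (where_found, location) for an assembly reference.

    loaded, gac and codebases map (name, version) to a location;
    probe_paths is the ordered list of candidate paths;
    files is the set of paths that exist.
    """
    key = (name, version)

    # Step 2: reuse an assembly that has already been bound.
    if key in loaded:
        return ("already-loaded", loaded[key])

    # Step 3: the global assembly cache is checked only for strong names.
    if strong_named and key in gac:
        return ("gac", gac[key])

    # Step 4.2: a configured codebase is the only location tried.
    if key in codebases:
        location = codebases[key]
        if location in files:
            return ("codebase", location)
        return ("failed", None)

    # Step 4.3: probe candidate locations in order.
    for path in probe_paths:
        if path in files:
            return ("probing", path)
    return ("failed", None)


files = {"app/Util.dll"}
probe = ["app/Util.dll", "app/Util/Util.dll"]
found = resolve("Util", "1.0.0.0", False, {}, {}, {}, probe, files)
```

Note how a codebase entry short-circuits probing: if the file is not at the configured location, the bind fails even though probing might have found it.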

 

Side-by-side Execution

Side-by-side execution is the ability to run multiple versions of the same assembly simultaneously. The common language runtime provides the infrastructure that allows multiple versions of the same assembly to run on the same computer, or even in the same process.

Code that supports side-by-side execution has more flexibility in providing compatibility with previous versions. Components that can run side by side do not have to maintain strict backward compatibility. For example, consider a class MyClassX that supports side-by-side execution. An incompatibility is introduced between versions 1 and 2. Callers of MyClassX that express a dependency on version 1 will always get version 1, regardless of how many subsequent versions of MyClassX are installed on the computer. A caller gets version 2 only if it specifically upgrades its version policy.

Support for side-by-side storage and execution of different versions of the same assembly is an integral part of versioning and is built into the infrastructure of the runtime. Because the assembly's version number is part of its identity, the runtime can store multiple versions of the same assembly in the global assembly cache and load those assemblies at run time.
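
Because the version is part of the key, a cache can hold several versions of the same assembly at once. The Python sketch below illustrates that idea with a dictionary keyed by name and version; the class, its methods and the payload strings are assumptions for illustration and do not describe how the global assembly cache is actually stored.

```python
class AssemblyCache:
    """A toy cache in which (name, version) identifies an entry."""

    def __init__(self):
        self._store = {}

    def install(self, name, version, payload):
        # Installing a new version never overwrites an older one.
        self._store[(name, version)] = payload

    def load(self, name, version):
        return self._store[(name, version)]

    def versions(self, name):
        return sorted(v for (n, v) in self._store if n == name)


cache = AssemblyCache()
cache.install("MyClassX", (1, 0, 0, 0), "behaviour of version 1")
cache.install("MyClassX", (2, 0, 0, 0), "behaviour of version 2")

# A caller that depends on version 1 still gets version 1.
old_caller_gets = cache.load("MyClassX", (1, 0, 0, 0))
```

A caller moves to version 2 only by asking for it, which corresponds to upgrading its version policy in the MyClassX example above.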

There are two types of side-by-side execution: running on the same computer and running in the same process.

Although the runtime provides you with the ability to create side-by-side applications, side-by-side execution is not automatic. You still must take great care in the code when creating applications intended to run side by side.