Assemblies, Metadata and Manifests

From http://msdn2.microsoft.com/en-us/library/hk5f40ct(VS.80).aspx.

Assemblies are the building blocks of .NET Framework applications; they form the fundamental unit of deployment, version control, reuse, activation scoping and security permissions. An assemblie is a collection of types and resourses that are built to work together and form a logical unit of functionality. An assembly provides the CLR with the information it needs to be aware of type implementations. To the runtime, a type does not exist outside of an assembly.

Assemblies Overview

Assemblies are a fundamental part of programming with the .NET Framework. An assembly performs the following functions:

It contains code that the common language runtime executes. Microsoft intermediate language (MSIL) code in a portable executable (PE) file will not be executed if it does not have an associated assembly manifest. Note that each assembly can have only one entry point (that is, DllMain, WinMain, or Main).
It forms a security boundary. An assembly is the unit at which permissions are requested and granted. For more information about security boundaries as they apply to assemblies, see Assembly Security Considerations.
It forms a type boundary. Every type's identity includes the name of the assembly in which it resides. A type called MyType loaded in the scope of one assembly is not the same as a type called MyType loaded in the scope of another assembly.
It forms a reference scope boundary. The assembly's manifest contains assembly metadata that is used for resolving types and satisfying resource requests. It specifies the types and resources that are exposed outside the assembly. The manifest also enumerates other assemblies on which it depends.
It forms a version boundary. The assembly is the smallest versionable unit in the common language runtime; all types and resources in the same assembly are versioned as a unit. The assembly's manifest describes the version dependencies you specify for any dependent assemblies. For more information about versioning, see Assembly Versioning.
It forms a deployment unit. When an application starts, only the assemblies that the application initially calls must be present. Other assemblies, such as localization resources or assemblies containing utility classes, can be retrieved on demand. This allows applications to be kept simple and thin when first downloaded. For more information about deploying assemblies, see Deploying Applications.
It is the unit at which side-by-side execution is supported. For more information about running multiple versions of the same assembly, see Side-by-Side Execution.
Assemblies can be static or dynamic. Static assemblies can include .NET Framework types (interfaces and classes), as well as resources for the assembly (bitmaps, JPEG files, resource files, and so on). Static assemblies are stored on disk in PE files. You can also use the .NET Framework to create dynamic assemblies, which are run directly from memory and are not saved to disk before execution. You can save dynamic assemblies to disk after they have executed.

There are several ways to create assemblies. You can use development tools, such as Visual Studio .NET, that you have used in the past to create .dll or .exe files. You can use tools provided in the .NET Framework SDK to create assemblies with modules created in other development environments. You can also use common language runtime APIs, such as Reflection.Emit, to create dynamic assemblies.

Assembly Benefits

Assemblies are designed to simplify application deployment and to solve versioning problems that can occur with component-based applications.

End users and developers are familiar with versioning and deployment issues that arise from today's component-based systems. Some end users have experienced the frustration of installing a new application on their computer, only to find that an existing application has suddenly stopped working. Many developers have spent countless hours trying to keep all necessary registry entries consistent in order to activate a COM class.

Many deployment problems have been solved by the use of assemblies in the .NET Framework. Because they are self-describing components that have no dependencies on registry entries, assemblies enable zero-impact application installation. They also simplify uninstalling and replicating applications.

Versioning Problems

Currently two versioning problems occur with Win32 applications:

Versioning rules cannot be expressed between pieces of an application and enforced by the operating system. The current approach relies on backward compatibility, which is often difficult to guarantee. Interface definitions must be static, once published, and a single piece of code must maintain backward compatibility with previous versions. Furthermore, code is typically designed so that only a single version of it can be present and executing on a computer at any given time.
There is no way to maintain consistency between sets of components that are built together and the set that is present at run time.

These two versioning problems combine to create DLL conflicts, where installing one application can inadvertently break an existing application because a certain software component or DLL was installed that was not fully backward compatible with a previous version. Once this situation occurs, there is no support in the system for diagnosing and fixing the problem.

An End to DLL Conflicts

Microsoft Windows 2000 began to fully address these problems. It provides two features that partially fix DLL conflicts:

Windows 2000 enables you to create client applications where the dependent .dll files are located in the same directory as the application's .exe file. Windows 2000 can be configured to check for a component in the directory where the .exe file is located before checking the fully qualified path or searching the normal path. This enables components to be independent of components installed and used by other applications.
Windows 2000 locks files that are shipped with the operating system in the System32 directory so they cannot be inadvertently replaced when applications are installed.

The common language runtime uses assemblies to continue this evolution toward a complete solution to DLL conflicts.

The Assembly Solution

To solve versioning problems, as well as the remaining problems that lead to DLL conflicts, the runtime uses assemblies to do the following:

Enable developers to specify version rules between different software components.
Provide the infrastructure to enforce versioning rules.
Provide the infrastructure to allow multiple versions of a component to be run simultaneously (called side-by-side execution).

Assembly Contents

In general, a static assembly can consist of four elements:

The assembly manifest, which contains assembly metadata.
Type metadata.
Microsoft intermediate language (MSIL) code that implements the types.
A set of resources.

Only the assembly manifest is required, but either types or resources are needed to give the assembly any meaningful functionality.

There are several ways to group these elements in an assembly. You can group all elements in a single physical file, which is shown in the following illustration.

Single-file assembly

Alternatively, the elements of an assembly can be contained in several files. These files can be modules of compiled code (.netmodule), resources (such as .bmp or .jpg files), or other files required by the application. Create a multifile assembly when you want to combine modules written in different languages and to optimize downloading an application by putting seldom used types in a module that is downloaded only when needed.

In the following illustration, the developer of a hypothetical application has chosen to separate some utility code into a different module and to keep a large resource file (in this case a .bmp image) in its original file. The .NET Framework downloads a file only when it is referenced; keeping infrequently referenced code in a separate file from the application optimizes code download.

Multifile assembly

Note The files that make up a multifile assembly are not physically linked by the file system. Rather, they are linked through the assembly manifest and the common language runtime manages them as a unit.

In this illustration, all three files belong to an assembly, as described in the assembly manifest contained in myAssembly.dll. To the file system, they are three separate files. Note that the file Util.netmodule was compiled as a module because it contains no assembly information. When the assembly was created, the assembly manifest was added to myAssembly.dll, indicating its relationship with Util.dll and Graphic.bmp.

When designing your source code today, you make explicit decisions about how to partition the functionality of your application into one or more files. When designing .NET Framework code, you will make similar decisions about how to partition the functionality into one or more assemblies.

Manifests

Every assembly, whether static or dynamic, contains a collection of data that describes how the elements in the assembly relate to each other. The assembly manifest contains this assembly metadata. An assembly manifest contains all the metadata needed to specify the assembly's version requirements and security identity, and all metadata needed to define the scope of the assembly and resolve references to resources and classes. The assembly manifest can be stored in either a portable executable (PE) file (an .exe or .dll) with Microsoft intermediate language (MSIL) code or in a standalone PE file that contains only assembly manifest information.

For an assembly with one associated file, the manifest is incorporated into the PE file to form a single-file assembly. You can create a multifile assembly with a standalone manifest file or with the manifest incorporated into one of the PE files in the assembly.

Each assembly's manifest performs the following functions:

Enumerates the files that make up the assembly.
Governs how references to the assembly's types and resources map to the files that contain their declarations and implementations.
Enumerates other assemblies on which the assembly depends.
Provides a level of indirection between consumers of the assembly and the assembly's implementation details.
Renders the assembly self-describing.

The following table shows the information contained in the assembly manifest. The first four items — the assembly name, version number, culture, and strong name information — make up the assembly's identity.

Information	Description
Assembly name	A text string specifying the assembly's name.
Version number	A major and minor version number, and a revision and build number. The common language runtime uses these numbers to enforce version policy.
Culture	Information on the culture or language the assembly supports. This information should be used only to designate an assembly as a satellite assembly containing culture- or language-specific information. (An assembly with culture information is automatically assumed to be a satellite assembly.)
Strong name information	The public key from the publisher if the assembly has been given a strong name.
List of all files in the assembly	A hash of each file contained in the assembly and a file name. Note that all files that make up the assembly must be in the same directory as the file containing the assembly manifest.
Type reference information	Information used by the runtime to map a type reference to the file that contains its declaration and implementation. This is used for types that are exported from the assembly.
Information on referenced assemblies	A list of other assemblies that are statically referenced by the assembly. Each reference includes the dependent assembly's name, assembly metadata (version, culture, operating system, and so on), and public key, if the assembly is strong named.

Global Assembly Cache

Each computer where the common language runtime is installed has a machine-wide code cache called the global assembly cache. The global assembly cache stores assemblies specifically designated to be shared by several applications on the computer.

You should share assemblies by installing them into the global assembly cache only when you need to. As a general guideline, keep assembly dependencies private and locate assemblies in the application directory unless sharing an assembly is explicitly required. In addition, you do not have to install assemblies into the global assembly cache to make them accessible to COM interop or unmanaged code.

Note There are scenarios where you explicitly do not want to install an assembly into the global assembly cache. If you place one of the assemblies that make up an application in the global assembly cache, you can no longer replicate or install the application by XCOPYing the application directory. You must move the assembly in the global assembly cache as well.

There are several ways to deploy an assembly into the global assembly cache:

Use an installer designed to work with the global assembly cache. This is the preferred option for installing assemblies into the global assembly cache.
Use a developer tool called the Global Assembly Cache tool (Gacutil.exe) provided by the .NET Framework SDK.
Use Windows Explorer to drag and drop assemblies into the cache.
Note In deployment scenarios, use Windows Installer 2.0 to install assemblies into the global assembly cache. Use Windows Explorer or the Global Assembly Cache tool only in development scenarios because they do not provide assembly reference counting and other features provided when using the Windows Installer.

Administrators often protect the WINNT directory using an Access Control List (ACL) to control write and execute access. Because the global assembly cache is installed in the WINNT directory, it inherits that directory's ACL. It is recommended that only users with Administrator privileges be allowed to delete files from the global assembly cache.

Assemblies deployed in the global assembly cache must have a strong name. When an assembly is added to the global assembly cache, integrity checks are performed on all files that make up the assembly. The cache performs these integrity checks to ensure that an assembly has not been tampered with (for example, when a file has changed but the manifest does not reflect the change).

Strong-named Assemblies

A strong name consists of the assembly's identity — its simple text name, version number, and culture information (if provided) — plus a public key and a digital signature. It is generated from an assembly file (the file that contains the assembly manifest, which in turn contains the names and hashes of all the files that make up the assembly), using the corresponding private key. Microsoft® Visual Studio .NET® and other development tools provided in the .NET Framework SDK can assign strong names to an assembly. Assemblies with the same strong name are expected to be identical.

You can ensure that a name is globally unique by signing an assembly with a strong name. In particular, strong names satisfy the following requirements:

Strong names guarantee name uniqueness by relying on unique key pairs. No one can generate the same assembly name that you can, because an assembly generated with one private key has a different name than an assembly generated with another private key.
Strong names protect the version lineage of an assembly. A strong name can ensure that no one can produce a subsequent version of your assembly. Users can be sure that a version of the assembly they are loading comes from the same publisher that created the version the application was built with.
Strong names provide a strong integrity check. Passing the .NET Framework security checks guarantees that the contents of the assembly have not been changed since it was built. Note, however, that strong names in and of themselves do not imply a level of trust, such as that provided by a digital signature and supporting certificate.

When you reference a strong-named assembly, you expect to get certain benefits, such as versioning and naming protection. If the strong-named assembly then references an assembly with a simple name, which does not have these benefits, you lose the benefits you would derive from using a strong-named assembly and revert to DLL conflicts. Therefore, strong-named assemblies can only reference other strong-named assemblies.

Assembly Security

When you build an assembly, you can specify a set of permissions that the assembly requires to run. Optional permissions can be granted by the security policy set on the computer where the assembly will run. If you want your code to handle all potential security exceptions, you can do one of the following:

Insert a permission request for all the permissions your code must have, and handle the load-time failure up front that occurs if the permissions are not granted.
Do not use a permission request to obtain permissions your code might need, but be prepared to handle security exceptions if permissions are not granted.

Note Security is a complex area and you have many options to choose from. For more information, see Key Security Concepts.

At load time, the assembly's evidence is used as input to security policy. Security policy is established by the enterprise and the computer's administrator as well as by user policy settings, and determines the set of permissions that is granted to all managed code when executed. Security policy can be established for the publisher of the assembly (if it has a signcode signature), for the Web site and zone (in Internet Explorer terms) the assembly was downloaded from, or for the assembly's strong name. For example, a computer's administrator can establish security policy that allows all code downloaded from a Web site and signed by a given software company to access a database on a computer, but does not grant access to write to the computer's disk.

Strong-Named Assemblies and Signcode

You can sign an assembly in two different, but complementary ways: with a strong name or using Signcode.exe. Signing an assembly with a strong name adds a public key encryption to the file containing the assembly manifest. Strong name signing ensures name uniqueness, prevents name spoofing and provides callers with some identity when a reference is resolved.

However, no level of trust is associated with a strong name, which makes signcode important. Signcode requires a publisher to prove their identity to a third-party authority and obtain a certificate. This certificate is then embedded in your file and can be used by an administrator to decide whether to trust the code's authenticity.

You can give both a strong name and a signcode digital signature to an assembly, or you can use either alone. Signcode can sign only one file at a time; for a multifile assembly, you sign the file that contains the assembly manifest. A strong name is stored in the file containing the assembly manifest, but a signcode signature is stored in a reserved slot in the portable executable (PE) file containing the assembly manifest. Signcode signing of an assembly can be used (with or without a strong name) when you already have a trust hierarchy that relies on signcode signatures, or when your policy uses only the key portion and does not check a chain of trust.

Note When using both a strong name and a signcode signature on an assembly, the strong name must be assigned first.

The common language runtime also performs a hash verification; the assembly manifest contains a list of all files that make up the assembly, including a hash of each file as it existed when the manifest was built. As each file is loaded, its contents are hashed and compared with the hash value stored in the manifest. If the two hashes do not match, the assembly fails to load.

Because strong naming and signcodes guarantee integrity, you can base code access security policy on these two forms of assembly evidence. Strong naming and signcode signing guarantee integrity through digital signatures and certificates. All the technologies mentioned—hash verification, strong naming, and signcodes—work together to ensure that the assembly has not been altered in any way.

Assembly Versioning

All versioning of assemblies that use the common language runtime is done at the assembly level. The specific version of an assembly and the versions of dependent assemblies are recorded in the assembly's manifest. The default version policy for the runtime is that applications run only with the versions they were built and tested with, unless overridden by explicit version policy in configuration files (the application configuration file, the publisher policy file, and the computer's administrator configuration file).

Note Versioning is done only on assemblies with strong names.

The runtime performs several steps to resolve an assembly binding request:

Checks the original assembly reference to determine the version of the assembly to be bound.
Checks for all applicable configuration files to apply version policy.
Determines the correct assembly from the original assembly reference and any redirection specified in the configuration files, and determines the version that should be bound to the calling assembly.
Checks the global assembly cache, codebases specified in configuration files, and then checks the application's directory and subdirectories using the probing rules explained in How the Runtime Locates Assemblies.

The following illustration shows these steps.

Resolving an assembly binding request

Assembly Version Number

Each assembly has a version number as part of its identity. As such, two assemblies that differ by version number are considered by the runtime to be completely different assemblies. This version number is physically represented as a four-part number with the following format:

<major version>.<minor version>.<build number>.<revision>

For example, version 1.5.1254.0 indicates 1 as the major version, 5 as the minor version, 1254 as the build number, and 0 as the revision number.

The version number is stored in the assembly manifest along with other identity information, including the assembly name and public key, as well as information on relationships and identities of other assemblies connected with the application.

When an assembly is built, the development tool records dependency information for each assembly that is referenced in the assembly manifest. The runtime uses these version numbers, in conjunction with configuration information set by an administrator, an application, or a publisher, to load the proper version of a referenced assembly.

The runtime distinguishes between regular and strong-named assemblies for the purposes of versioning. Version checking only occurs with strong-named assemblies.

For information about specifying version binding policies, see Configuration Files. For information about how the runtime uses version information to find a particular assembly, see How the Runtime Locates Assemblies.

How the Runtime Locates Assemblies

To successfully deploy your .NET Framework application, you must understand how the common language runtime locates and binds to the assemblies that make up your application. By default, the runtime attempts to bind with the exact version of an assembly that the application was built with. This default behavior can be overridden by configuration file settings.

The common language runtime performs a number of steps when attempting to locate an assembly and resolve an assembly reference. Each step is explained in the following sections. The term probing is often used when describing how the runtime locates assemblies; it refers to the set of heuristics used to locate the assembly based on its name and culture.

Note You can view binding information in the log file using the Assembly Binding Log Viewer (Fuslogvw.exe), which is included in the .NET Framework SDK.

Initiating the Bind

The process of locating and binding to an assembly begins when the runtime attempts to resolve a reference to another assembly. This reference can be either static or dynamic. The compiler records static references in the assembly manifest's metadata at build time. Dynamic references are constructed on the fly as a result of calling various methods, such as System.Reflection.Assembly.Load.

The preferred way to reference an assembly is to use a full reference, including the assembly name, version, culture, and public key token (if one exists). The runtime uses this information to locate the assembly, following the steps described later in this section. The runtime uses the same resolution process regardless of whether the reference is for a static or dynamic assembly.

You can also make a dynamic reference to an assembly by providing the calling method with only partial information about the assembly, such as specifying only the assembly name. In this case, only the application directory is searched for the assembly, and no other checking occurs. You make a partial reference using any of the various methods for loading assemblies such as System.Reflection.Assembly.Load or AppDomain.Load. If you want the runtime to check the global assembly cache as well as the application directory for a referenced assembly, you can specify a partial reference using the System.Reflection.Assembly.LoadWithPartialName method. For more information about partial binding, see Partial Assembly References.

Finally, you can make a dynamic reference using a method such as System.Reflection.Assembly.Load and provide only partial information; you then qualify the reference using the <qualifyAssembly> element in the application configuration file. This element allows you to provide the full reference information (name, version, culture and, if applicable, the public key token) in your application configuration file instead of in your code. You would use this technique if you wanted to fully qualify a reference to an assembly outside the application directory, or if you wanted to reference an assembly in the global assembly cache but you wanted the convenience of specifying the full reference in the configuration file instead of in your code.

Note This type of partial reference should not be used with assemblies that are shared among several applications. Because configuration settings are applied per application and not per assembly, a shared assembly using this type of partial reference would require each application using the shared assembly to have the qualifying information in its configuration file.

The runtime uses the following steps to resolve an assembly reference:

Determines the correct assembly version by examining applicable configuration files, including the application configuration file, publisher policy file, and machine configuration file. If the configuration file is located on a remote machine, the runtime must locate and download the application configuration file first.
Checks whether the assembly name has been bound to before and, if so, uses the previously loaded assembly.
Checks the global assembly cache. If the assembly is found there, the runtime uses this assembly.
Probes for the assembly using the following steps:
1. If configuration and publisher policy do not affect the original reference and if the bind request was created using the Assembly.LoadFrom method, the runtime checks for location hints.
2. If a codebase is found in the configuration files, the runtime checks only this location. If this probe fails, the runtime determines that the binding request failed and no other probing occurs.
3. Probes for the assembly using the heuristics described in the probing section. If the assembly is not found after probing, the runtime requests the Windows Installer to provide the assembly. This acts as an install-on-demand feature.
Note There is no version checking for assemblies without strong names, nor does the runtime check in the global assembly cache for assemblies without strong names.

Side-by-side Execution

Side-by-side execution is the ability to run multiple versions of the same assembly simultaneously. The common language runtime provides the infrastructure that allows multiple versions of the same assembly to run on the same computer, or even in the same process.

Code that is capable of side-by-side execution more flexibly provides compatibility with previous versions. Components that can run side by side do not have to maintain strict backward compatibility. For example, consider a class MyClassX that supports side-by-side execution. An incompatibility is introduced between versions 1 and 2. Callers of MyClassX that express a dependency on version 1 will always get version 1, regardless of how many subsequent versions of MyClassX are installed on the computer. A caller gets version 2 only if it specifically upgrades its version policy.

Support for side-by-side storage and execution of different versions of the same assembly is an integral part of versioning and is built into the infrastructure of the runtime. Because the assembly's version number is part of its identity, the runtime can store multiple versions of the same assembly in the global assembly cache and load those assemblies at run time.

There are two types of side-by-side execution:

Running on the same computer.
In this type of side-by-side execution, multiple versions of the same application run on the same computer at the same time without interfering with each other. An application that supports side-by-side execution on the same computer demands careful coding. For example, consider an application that uses a file at a specific location for a cache. The application must either handle multiple versions of the application accessing the file at the same time or the application must remove the dependency on the specific location and allow each version to have its own cache.
Running in the same process.
To successfully run side by side in the same process, multiple versions cannot have any strict dependencies on process-wide resources. Side-by-side execution in the same process means that single applications with multiple dependencies can run multiple versions of the same component, and Web servers and other hosts can run applications with possibly conflicting dependencies in the same process.

Although the runtime provides you with the ability to create side-by-side applications, side-by-side execution is not automatic. You still must take great care in the code when creating applications intended to run side by side.