Unicode Transformation
UTF stands for Unicode Transformation Format. There are several Unicode transformation formats (also called Unicode encoding forms), such as UTF-7, UTF-8, UTF-16 and UTF-32. The numbers 7, 8, 16 and 32 refer to the size in bits of the code unit, the basic building block of the encoding, and not to a fixed number of bits per character. UTF-32 encodes every code point as a single 32-bit unit, UTF-16 uses one or two 16-bit units, and UTF-8 uses one to four 8-bit units.
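To make the code unit sizes concrete, here is a minimal sketch that counts the bytes a few characters need in each format. The class name CodeUnitSizes and the method ByteCounts are made up for this illustration; the Encoding properties come from System.Text.

```csharp
using System;
using System.Text;

static class CodeUnitSizes
{
    // Hypothetical helper: number of bytes each format needs for the given text.
    public static (int Utf8, int Utf16, int Utf32) ByteCounts(string text) =>
        (Encoding.UTF8.GetByteCount(text),
         Encoding.Unicode.GetByteCount(text),   // Encoding.Unicode is UTF-16 (little-endian)
         Encoding.UTF32.GetByteCount(text));

    static void Main()
    {
        // U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (an emoji)
        foreach (string s in new[] { "A", "\u00E9", "\u20AC", "\U0001F600" })
        {
            var (u8, u16, u32) = ByteCounts(s);
            int codePoint = char.ConvertToUtf32(s, 0);
            Console.WriteLine($"U+{codePoint:X4}: UTF-8={u8}, UTF-16={u16}, UTF-32={u32} bytes");
        }
        // Prints:
        // U+0041: UTF-8=1, UTF-16=2, UTF-32=4 bytes
        // U+00E9: UTF-8=2, UTF-16=2, UTF-32=4 bytes
        // U+20AC: UTF-8=3, UTF-16=2, UTF-32=4 bytes
        // U+1F600: UTF-8=4, UTF-16=4, UTF-32=4 bytes
    }
}
```

UTF-32 always needs 4 bytes, while UTF-8 and UTF-16 grow with the value of the code point.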
First, each character of the character set is mapped to a number, called its code point. This number is then encoded into a sequence of bits; the transformation format in use determines how many bits are produced and how they are laid out. Encoding means converting a character's code point into a sequence of 0 and 1 bits. Decoding is the inverse of encoding: it recovers the code point from the sequence of bits.
In summary:
- Each character is mapped to a number, called its code point.
- The code point is encoded into a sequence of bits; the number of bits used depends upon the transformation format.
- Encoding means converting a character into a sequence of bits.
- Decoding means converting a sequence of bits back into a character.
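The steps above can be sketched as a round trip through UTF-8. This is only an illustration: the class RoundTrip and its Encode and Decode helpers are hypothetical names, and UTF-8 is chosen arbitrarily as the transformation format.

```csharp
using System;
using System.Text;

static class RoundTrip
{
    // Encoding: text -> UTF-8 bytes (hypothetical helper).
    public static byte[] Encode(string text) => Encoding.UTF8.GetBytes(text);

    // Decoding: UTF-8 bytes -> text (hypothetical helper).
    public static string Decode(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    static void Main()
    {
        string ch = "\u20AC";                         // the euro sign
        int codePoint = char.ConvertToUtf32(ch, 0);   // character -> code point
        byte[] bytes = Encode(ch);                    // code point -> sequence of bits

        Console.WriteLine($"Code point: U+{codePoint:X4}");                 // U+20AC
        Console.WriteLine($"UTF-8 bytes: {BitConverter.ToString(bytes)}");  // E2-82-AC

        // Show each byte as 8 bits: 11100010 10000010 10101100
        string bits = string.Join(" ",
            Array.ConvertAll(bytes, b => Convert.ToString(b, 2).PadLeft(8, '0')));
        Console.WriteLine($"Bits: {bits}");

        // Decoding gives back the original character.
        Console.WriteLine(Decode(bytes) == ch);       // True
    }
}
```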
Character set in HTML
In HTML documents, the character encoding of the document is declared using the charset attribute of the meta tag. For example, in HTML5, <meta charset="UTF-8"> declares that the document is encoded as UTF-8.
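The declaration matters because the same bytes decode differently under different character sets. The following sketch simulates a mismatch by decoding UTF-8 bytes as ISO-8859-1; the class CharsetMismatch and the method DecodeAsLatin1 are hypothetical names, and the snippet stands in for a mislabelled page rather than reproducing what any particular browser does.

```csharp
using System;
using System.Text;

static class CharsetMismatch
{
    // Hypothetical helper: encode as UTF-8, then decode with the wrong charset.
    public static string DecodeAsLatin1(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8);
    }

    static void Main()
    {
        string wrong = DecodeAsLatin1("caf\u00E9");   // "cafe" with an acute accent on the e

        // U+00E9 is the two bytes C3 A9 in UTF-8; read as ISO-8859-1 they
        // become two separate characters, U+00C3 and U+00A9.
        foreach (char c in wrong)
            Console.Write($"U+{(int)c:X4} ");
        Console.WriteLine();
        // Prints: U+0063 U+0061 U+0066 U+00C3 U+00A9
    }
}
```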
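C# Example to convert characters into code points and back
using System;
namespace EncodingDecoding
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("First Approach: Char.ConvertToUtf32 & Char.ConvertFromUtf32");
            string alphabets = "The quick brown fox jumps over the lazy dog";
            for (int i = 0; i < alphabets.Length; i++)
            {
                // Code point of the character that starts at index i.
                Int32 x = Char.ConvertToUtf32(alphabets, i);
                Console.Write(x);
                Console.Write(" integer represents ");
                // Convert the code point back into a string.
                Console.Write(Char.ConvertFromUtf32(x));
                Console.WriteLine();
                // A character outside the Basic Multilingual Plane occupies two
                // chars (a surrogate pair); skip the second half of the pair.
                if (Char.IsHighSurrogate(alphabets[i]))
                {
                    i++;
                }
            }
            Console.WriteLine("\nSecond Approach : Typecast");
            string quote = "He who has a why to live can bear almost any how.";
            char[] chars = quote.ToCharArray();
            foreach (var c in chars)
            {
                // Casting a char to int gives its UTF-16 code unit, which equals
                // the code point for characters in the Basic Multilingual Plane.
                Console.WriteLine(c + " " + (int)c);
            }
            Console.ReadKey();
        }
    }
}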
OUTPUT:
First Approach: Char.ConvertToUtf32 & Char.ConvertFromUtf32
84 integer represents T
104 integer represents h
101 integer represents e
32 integer represents
113 integer represents q
---
Second Approach : Typecast
H 72
e 101
32
w 119
h 104
---
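Both approaches print the same numbers for the sentences above because every character fits in a single char. For characters beyond U+FFFF they diverge, as the following sketch suggests. The class CodePoints and the method Of are hypothetical names, and the loop assumes well-formed input (Char.ConvertToUtf32 throws on an unpaired surrogate).

```csharp
using System;
using System.Collections.Generic;

static class CodePoints
{
    // Hypothetical helper: lists the code points of a well-formed string,
    // stepping over the second half of each surrogate pair.
    public static List<int> Of(string text)
    {
        var result = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            result.Add(char.ConvertToUtf32(text, i));
            if (char.IsHighSurrogate(text[i])) i++;   // skip the low surrogate
        }
        return result;
    }

    static void Main()
    {
        string s = "A\U0001F600";          // 'A' followed by U+1F600
        Console.WriteLine(s.Length);       // 3: one char for 'A', two for U+1F600

        // Char.ConvertToUtf32 reports whole code points.
        foreach (int cp in Of(s))
            Console.WriteLine($"U+{cp:X4}");      // U+0041, then U+1F600

        // A typecast reports UTF-16 code units instead.
        foreach (char c in s)
            Console.WriteLine($"0x{(int)c:X4}");  // 0x0041, 0xD83D, 0xDE00
    }
}
```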
C# Example to call String.Substring through reflection
using System;
using System.Reflection;
class Example
{
    static void Main()
    {
        // Look up the Substring(int, int) overload on System.String.
        Type t = typeof(String);
        MethodInfo substr = t.GetMethod("Substring",
            new Type[] { typeof(int), typeof(int) });
        // Invoke it on "Hello, World!" with startIndex 7 and length 5; the result is "World".
        Object result =
            substr.Invoke("Hello, World!", new Object[] { 7, 5 });
        Console.WriteLine("{0} returned \"{1}\".", substr, result);
        Console.ReadKey();
    }
}
C# Char: converting a character into a string
using System;
namespace ConsoleToString
{
    class Program
    {
        static void Main(string[] args)
        {
            char ch = 'A';
            Console.WriteLine(ch.ToString());      // Non-static method Output: "A"
            Console.WriteLine(Char.ToString('B')); // static Method Output: "B"
            Console.ReadKey();
        }
    }
}
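The two ToString calls are not the only option. As a small sketch, the hypothetical helper Variants below collects a few other standard ways of turning a char into a string; all of them should yield the same one-character string.

```csharp
using System;

static class CharToString
{
    // Hypothetical helper: several equivalent char-to-string conversions.
    public static string[] Variants(char ch) => new[]
    {
        ch.ToString(),       // instance method
        char.ToString(ch),   // static method
        new string(ch, 1),   // string made of one copy of ch
        $"{ch}",             // string interpolation
    };

    static void Main()
    {
        foreach (string s in Variants('A'))
            Console.WriteLine(s);   // prints "A" four times
    }
}
```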