The Maldives Child Protection Database was officially launched on 24 November 2010 with Maldives Vice-President, Dr Mohamed Waheed Hassan Manik as the chief guest at the ceremony.
Mohamed Naeem, Child Protection Specialist in the UNICEF Maldives Country Office, describes this event as marking "another milestone in the work that we are doing to improve the situation of children in Maldives and in or work to achieving the right to protection for Maldivian children". He also noted that "The good work done by Illuminaries was highlighted by everyone. I personally would like to highlight the extra mile you went to get to this point. So a very special thanks to you both [Tony and Jim]".
Following the launch we have received and been copied into a number of emails from others who have had an involvement in the project.
From Joshua Kakaire - formerly UNICEF Maldives Country Office
"Congratulations Naeem and team, it is more your resilience and relentless efforts that have yielded this historic achievement. Now the real stuff starts; populating it, generating reports and utilising the data for the betterment of children and women in 'Paradise'..."
From Manssor Ali - Special Advisor, UNICEF , New York
"Naeem your hard work and persistence has paid-off. I would like to express my thanks to all of you as well as Illuminaires for their dedication and patience in delivering a user sensitive system that will help in transforming the work of the Government to plan and effectively deliver protection measures to the troubled communities and families in Maldives."
From Abheet Solomon - UNICEF Tanzania Country Office
"Congratulations to you, UNICEF Maldives team(s), Govt Ministries and Illuminaries for making this happen!!! Reminds me of an old African proverb "Ideas are worth a dime dozen, but people who put them into action are priceless"
There aren't very many examples of such integrated multi-sectoral systems, and the launch of this is excellent news."
Friday, 3 December 2010
Thursday, 27 May 2010
Report Viewer Control
I thought it would be useful to blog something about the ReportViewer control which was new to VS 2008. This control takes rdlc report definitions, client based versions of the rdl report definitions of SQL Server Reporting Services. Designing the reports is largely the same as in SSRS but there are some gnarly issues when using the ReportViewer control which have taken some experimentation and Googling around to overcome. The following relates to ReportViewer in VB win forms.Mostly it applies to asp.net web forms as well.
What is the ReportViewer control?
Well it is just that, a control, and it has to be placed in a form. It is in effect a container for a report. The wizard allows you to pick a report and you can change it later via the properties box for the control but there are issues with data sources after that, which I'll cover later.
If you want to specify parameter values, you have to add those to your form as text boxes or combo lists and pass through the values through to the report. That's where the fun starts.
I've found it's often easier to view the report definition as xml and edit this directly, especially if you need to change the sql definition for the data source, for instance if you've changed the data type of a field.
Filtering
There are two ways of doing this: by passing the filter as a parameter through to the report or by filtering the report dataset loaded with the form.
I never got the first method to work. I might revisit the problem now I have a greater understanding of the ReportViewer control. The control is quite unforgiving to coding errors and I've found that most problems are related to data in one way or another.
In a report which shows overdue actions for a case worker, I wanted to be able to filter the report by Case Worker. The easiest way I found of doing this was to select the case worker from a list on my form and then refill the table adaptor and refresh the report viewer.
Dim awid As Integer = cboCaseWorker.SelectedItem
Me.taCaseworkerActivityReport.Fill(Me.dsReport.spCaseworkerActivityReport, Me.dtStartDate.Value, Me.dtEndDate.Value, awid)
Me.ReportViewer1.RefreshReport()
where, dsReport is my dataset, spCaseworkerActivityReport is a sproc and dtStartDate, dtEndDate are date picker controls on the form, used to specify the period.
Passing parameters
In SSRS, this is pretty easy, you define the parameter in the report, including the fact that its value will be selected from a list and that the data for the list is from a data source, also defined within the report. When you display the report in the browser, SSRS automatically provides the drop down list, the user makes a selection and the selected value is passed to the report.
ReportViewer doesn't do any of this for you. Instead you have to add your own text or combo boxes to the form and pass the selected parameter to ReportViewer with code.
You start by defining the parameters in the report in the usual way. Right click on top left corner of the report and select Report Parameters
In this case I am passing in a single parameter, Period, which is the period for the report, as a text string. This will be displayed in a text box on the report header. Note setting the prompt is largely pointless here - it is never displayed.
In the form's Load event, or a function called by it, add the following code
Dim p As ReportParameter
p = New ReportParameter("", ""))
ReportViewer1.LocalReport.SetParameters(New ReportParameter() {p})
Me.ReportViewer1.RefreshReport()
where in this caseis and is my formatted text string (note both arguments are strings, even if the parameter data type isn't)
I never successfully managed to pass through more than one parameter to a report, e.g.
Dim p1 As ReportParameter
Dim p2 As ReportParameter
p1 = New ReportParameter("Param1", "1"))
p2 = New ReportParameter("Param2", "2"))
ReportViewer1.LocalReport.SetParameters(New ReportParameter() {p1,p2})
Sub-reports
Sub-reports are just reports dropped into a sub-report control placed in the parent report. The issue with handling sub-reports in the ReportViewer control is in setting the data source for the sub-report.
If you don't set the data source for the sub-report, when you run the report you will see the message
Error: Subreport could not be shown
Actually, it is necessary to set the data source for each list or table in the sub-report. If the sub-report contained two tables, you would need to set two data sources.
To fix the error message, assume I have a form, frmReport which contains a ReportViewer control, ReportViewer1.
I start by adding an event handler in the Load event of the form
Private Sub frmReport_Load(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles MyBase.Load
AddHandler ReportViewer1.LocalReport.SubreportProcessing, _
AddressOf Me.SetSubDataSource
Me.ReportViewer1.RefreshReport()
End Sub
Next define the Sub called by the event handler. In this example, I am assuming the report has two sub-reports, SubReport1 and Subreport2, each with a single table. For each sub-report, I fill a tableadaptor for the sub-report filtering the dataset by the ID of the row in the main report. I then add a new report data source, e.g. "DataSet_vwSubReport1"
The data source name should be copied from the table or list dataset name. Right click on top left corner of table or list and open the properties dialog
Public Sub SetSubDataSource(ByVal sender As Object, _
ByVal e As SubreportProcessingEventArgs)
If e.ReportPath = "SubReport1" Then
Me.taSubReport1.FillUsingId(Me.dsReport.vwSubReport1, Id)
e.DataSources.Add(New ReportDataSource("DataSet_vwSubReport1" _
, Me.dsReport.vwSubReport1))
ElseIf e.ReportPath = "SubReport2" Then
Me.taSubReport2.FillUsingId(Me.dsReport.vwSubReport2, Id)
e.DataSources.Add(New ReportDataSource("DataSet_vwSubReport2" _
, Me.dsReport.vwSubReport2))
End If
End Sub
This is manipulating the collection of ReportDataSource objects of the report. A ReportDataSource object has a name and a value which is the data source.
Sub-reports - filtering by main report
In the subreport we must a) define a parameter and b) set the table filter to the value passed via the parameter, e.g. for subject activities
a) define a parameter, SubjectId, by right clicking outside the drawing area to display popup
and select Report parameters
b) set the table filter by clicking on the table and right click in top left corner
Select properties to display table properties dialog and select filter tab. set up the filter
The expression to be filtered is the SubjectId field and the filter value is the value of the SubjectId parameter.
Finally, pass the parameter value from the main report to the subreport . In the main report open the parameters property of the subreport control
The parameter name must match that defined in the subreport. In this example, the value is the value of the SubjectId field.
Switching Reports in Report Viewer
The MCPD operates in both English and Dhivehi and reports can be delivered in both languages. I'll blog elsewhere about the issues of dual language, especially when one of the language uses a non-latin character set which runs right to left.
My approach to the dual language problem for reports is to use the same Windows form, with ReportViewer Control for each language setting and to reference different reports for the English and Dhivehi versions.
For each Language the underlying datasource is different, one delivering text values in English and one in Dhivehi.
In the form.designer.vb comment out the lines
'Dim ReportDataSource1 As Microsoft.Reporting.WinForms.ReportDataSource = New Microsoft.Reporting.WinForms.ReportDataSource
and
'ReportDataSource1.Name = "MCPDDataSet_vwRptE_CaseReport"
'ReportDataSource1.Value = Me.vwRptE_CaseReportBindingSource
'Me.ReportViewer1.LocalReport.DataSources.Add(ReportDataSource1)
'Me.ReportViewer1.LocalReport.ReportEmbeddedResource = "MCPD.E_CaseReport.rdlc"
In the form.vb load event or a custom New or elsewhere add lines to clear the existing report data sources
• declare a ReportDataSource variable
• set its name and value
• add this new ReportDataSource to the report
• finally we set ReportViewer Control's rdlc in code as the project name (MCPD) and report object.
Public Class CaseReport
Private mReportDataSource As New Microsoft.Reporting.WinForms.ReportDataSource
'set report and report datasource
Me.ReportViewer1.LocalReport.DataSources.Clear()
If LanguageCode = "EN" Then
mReportDataSource.Name = "MCPDDataSet_vwRptE_CaseReport"
mReportDataSource.Value = Me.vwRptE_CaseReportBindingSource
Me.ReportViewer1.LocalReport.DataSources.Add(mReportDataSource)
Me.ReportViewer1.LocalReport.ReportEmbeddedResource = "MCPD.E_CaseReport.rdlc"
Else
mReportDataSource.Name = "MCPDDataSet_vwRptD_CaseReport"
mReportDataSource.Value = Me.vwRptD_CaseReportBindingSource
Me.ReportViewer1.LocalReport.DataSources.Add(mReportDataSource)
Me.ReportViewer1.LocalReport.ReportEmbeddedResource = "MCPD.D_CaseReport.rdlc"
End If
You must add tableadaptor and binding source to the form.vb{Design]. The easiest way to do this is to select each one in turn using the reportviewer tasks panel
Selecting the report will add the datasource, tableadaptor and bindingsource controls to the designer
Gotchas
Report viewer is quite unforgiving on coding errors and quite unhelpful in indicating the source of the error. In almost all cases I have found the problem to be connected with data. Below I've listed errors I have made which have caused the ReportViewer control to not display the report.
1) If you change the size of a text field in the database, it is necessary to set the property for that field in the dataset designer for the table AND for any view using that field. If a report uses a table or view where the actual field size in the db and the property in the ds designer are different, it can lead to an error the first time the fill adaptor is called (if you step through the code in debug mode, you can see the fill method fail)
2) If the report parameter is set to the wrong type, e.g. if an integer parameter Id, is defined as string (default) instead of integer, the subreport will not display
3) if there is no select permission on the sql datasource (view, table) for the connection string, the sub report will not display. No error will be raised.
What is the ReportViewer control?
Well it is just that, a control, and it has to be placed in a form. It is in effect a container for a report. The wizard allows you to pick a report and you can change it later via the properties box for the control but there are issues with data sources after that, which I'll cover later.
If you want to specify parameter values, you have to add those to your form as text boxes or combo lists and pass through the values through to the report. That's where the fun starts.
I've found it's often easier to view the report definition as xml and edit this directly, especially if you need to change the sql definition for the data source, for instance if you've changed the data type of a field.
Filtering
There are two ways of doing this: by passing the filter as a parameter through to the report or by filtering the report dataset loaded with the form.
I never got the first method to work. I might revisit the problem now I have a greater understanding of the ReportViewer control. The control is quite unforgiving to coding errors and I've found that most problems are related to data in one way or another.
In a report which shows overdue actions for a case worker, I wanted to be able to filter the report by Case Worker. The easiest way I found of doing this was to select the case worker from a list on my form and then refill the table adaptor and refresh the report viewer.
Dim awid As Integer = cboCaseWorker.SelectedItem
Me.taCaseworkerActivityReport.Fill(Me.dsReport.spCaseworkerActivityReport, Me.dtStartDate.Value, Me.dtEndDate.Value, awid)
Me.ReportViewer1.RefreshReport()
where, dsReport is my dataset, spCaseworkerActivityReport is a sproc and dtStartDate, dtEndDate are date picker controls on the form, used to specify the period.
Passing parameters
In SSRS, this is pretty easy, you define the parameter in the report, including the fact that its value will be selected from a list and that the data for the list is from a data source, also defined within the report. When you display the report in the browser, SSRS automatically provides the drop down list, the user makes a selection and the selected value is passed to the report.
ReportViewer doesn't do any of this for you. Instead you have to add your own text or combo boxes to the form and pass the selected parameter to ReportViewer with code.
You start by defining the parameters in the report in the usual way. Right click on top left corner of the report and select Report Parameters
In this case I am passing in a single parameter, Period, which is the period for the report, as a text string. This will be displayed in a text box on the report header. Note setting the prompt is largely pointless here - it is never displayed.
In the form's Load event, or a function called by it, add the following code
Dim p As ReportParameter
p = New ReportParameter("
ReportViewer1.LocalReport.SetParameters(New ReportParameter() {p})
Me.ReportViewer1.RefreshReport()
where in this case
I never successfully managed to pass through more than one parameter to a report, e.g.
Dim p1 As ReportParameter
Dim p2 As ReportParameter
p1 = New ReportParameter("Param1", "1"))
p2 = New ReportParameter("Param2", "2"))
ReportViewer1.LocalReport.SetParameters(New ReportParameter() {p1,p2})
Sub-reports
Sub-reports are just reports dropped into a sub-report control placed in the parent report. The issue with handling sub-reports in the ReportViewer control is in setting the data source for the sub-report.
If you don't set the data source for the sub-report, when you run the report you will see the message
Error: Subreport could not be shown
Actually, it is necessary to set the data source for each list or table in the sub-report. If the sub-report contained two tables, you would need to set two data sources.
To fix the error message, assume I have a form, frmReport which contains a ReportViewer control, ReportViewer1.
I start by adding an event handler in the Load event of the form
Private Sub frmReport_Load(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles MyBase.Load
AddHandler ReportViewer1.LocalReport.SubreportProcessing, _
AddressOf Me.SetSubDataSource
Me.ReportViewer1.RefreshReport()
End Sub
Next define the Sub called by the event handler. In this example, I am assuming the report has two sub-reports, SubReport1 and Subreport2, each with a single table. For each sub-report, I fill a tableadaptor for the sub-report filtering the dataset by the ID of the row in the main report. I then add a new report data source, e.g. "DataSet_vwSubReport1"
The data source name should be copied from the table or list dataset name. Right click on top left corner of table or list and open the properties dialog
Public Sub SetSubDataSource(ByVal sender As Object, _
ByVal e As SubreportProcessingEventArgs)
If e.ReportPath = "SubReport1" Then
Me.taSubReport1.FillUsingId(Me.dsReport.vwSubReport1, Id)
e.DataSources.Add(New ReportDataSource("DataSet_vwSubReport1" _
, Me.dsReport.vwSubReport1))
ElseIf e.ReportPath = "SubReport2" Then
Me.taSubReport2.FillUsingId(Me.dsReport.vwSubReport2, Id)
e.DataSources.Add(New ReportDataSource("DataSet_vwSubReport2" _
, Me.dsReport.vwSubReport2))
End If
End Sub
This is manipulating the collection of ReportDataSource objects of the report. A ReportDataSource object has a name and a value which is the data source.
Sub-reports - filtering by main report
In the subreport we must a) define a parameter and b) set the table filter to the value passed via the parameter, e.g. for subject activities
a) define a parameter, SubjectId, by right clicking outside the drawing area to display popup
and select Report parameters
b) set the table filter by clicking on the table and right click in top left corner
Select properties to display table properties dialog and select filter tab. set up the filter
The expression to be filtered is the SubjectId field and the filter value is the value of the SubjectId parameter.
Finally, pass the parameter value from the main report to the subreport . In the main report open the parameters property of the subreport control
The parameter name must match that defined in the subreport. In this example, the value is the value of the SubjectId field.
Switching Reports in Report Viewer
The MCPD operates in both English and Dhivehi and reports can be delivered in both languages. I'll blog elsewhere about the issues of dual language, especially when one of the language uses a non-latin character set which runs right to left.
My approach to the dual language problem for reports is to use the same Windows form, with ReportViewer Control for each language setting and to reference different reports for the English and Dhivehi versions.
For each Language the underlying datasource is different, one delivering text values in English and one in Dhivehi.
In the form.designer.vb comment out the lines
'Dim ReportDataSource1 As Microsoft.Reporting.WinForms.ReportDataSource = New Microsoft.Reporting.WinForms.ReportDataSource
and
'ReportDataSource1.Name = "MCPDDataSet_vwRptE_CaseReport"
'ReportDataSource1.Value = Me.vwRptE_CaseReportBindingSource
'Me.ReportViewer1.LocalReport.DataSources.Add(ReportDataSource1)
'Me.ReportViewer1.LocalReport.ReportEmbeddedResource = "MCPD.E_CaseReport.rdlc"
In the form.vb load event or a custom New or elsewhere add lines to clear the existing report data sources
• declare a ReportDataSource variable
• set its name and value
• add this new ReportDataSource to the report
• finally we set ReportViewer Control's rdlc in code as the project name (MCPD) and report object.
Public Class CaseReport
Private mReportDataSource As New Microsoft.Reporting.WinForms.ReportDataSource
'set report and report datasource
Me.ReportViewer1.LocalReport.DataSources.Clear()
If LanguageCode = "EN" Then
mReportDataSource.Name = "MCPDDataSet_vwRptE_CaseReport"
mReportDataSource.Value = Me.vwRptE_CaseReportBindingSource
Me.ReportViewer1.LocalReport.DataSources.Add(mReportDataSource)
Me.ReportViewer1.LocalReport.ReportEmbeddedResource = "MCPD.E_CaseReport.rdlc"
Else
mReportDataSource.Name = "MCPDDataSet_vwRptD_CaseReport"
mReportDataSource.Value = Me.vwRptD_CaseReportBindingSource
Me.ReportViewer1.LocalReport.DataSources.Add(mReportDataSource)
Me.ReportViewer1.LocalReport.ReportEmbeddedResource = "MCPD.D_CaseReport.rdlc"
End If
You must add tableadaptor and binding source to the form.vb{Design]. The easiest way to do this is to select each one in turn using the reportviewer tasks panel
Selecting the report will add the datasource, tableadaptor and bindingsource controls to the designer
Gotchas
Report viewer is quite unforgiving on coding errors and quite unhelpful in indicating the source of the error. In almost all cases I have found the problem to be connected with data. Below I've listed errors I have made which have caused the ReportViewer control to not display the report.
1) If you change the size of a text field in the database, it is necessary to set the property for that field in the dataset designer for the table AND for any view using that field. If a report uses a table or view where the actual field size in the db and the property in the ds designer are different, it can lead to an error the first time the fill adaptor is called (if you step through the code in debug mode, you can see the fill method fail)
2) If the report parameter is set to the wrong type, e.g. if an integer parameter Id, is defined as string (default) instead of integer, the subreport will not display
3) if there is no select permission on the sql datasource (view, table) for the connection string, the sub report will not display. No error will be raised.
Labels:
rdlc,
Report Viewer control,
Visual Basic Reports,
VS2008
Wednesday, 19 May 2010
Leaving Male' on a High
After three extremely frustrating days trying to get a connection from the social workers offices to the database, we finally achieved success this afternoon and got 4 PCs set up and some of the users logging in to see their cases migrated from their old system.
Jim and I fly home tomorrow morning but we're happy bunnies tonight.
The DGFPS Office on Sosun Magu this afternoon
Jim helping one of the social workers find a case on the new system
A message from Laura (who was the unicef project manager for the 2007 scoping exercise) to unicef today to ask if it had crashed yet! Well yes it did but only when Jim logged in, so that doesn't count.
Sunday, 16 May 2010
We're Live
We went live today - well almost. The server is now in a rack at Police HQ and we have the first admin user set up. Tomorrow and Tuesday will be spent touring the various agencies to help set up PCs and get users onto the system. What Jim and I started in October 2007 is finally realised - a national, multi-agency, child protection database for Maldives.
The server about to leave unicef in a taxi (that isn't the taxi!)
The server outside Police headquarters
Oh yes, we did have one small problem. The server rails got missed off the order to Dell!
Saturday, 15 May 2010
Translation with an accent
After a couple of months of working with the translated text for the database (screens, lookup lists, error messages, etc) along with the issue of thaana characters in ASCII and unicode (to be blogged shortly), I've learnt most of the common Thaana characters.
People here have been telling me that dhivehi is written phonetically. Well now I'm going round reading signs in Thaana (to the annoynace of my colleagues) and finding that many of the words are just the English translated into a 'pseudo-dhivehi', sometimes with humorous outcomes.
For instance, the desserts section of one restaurant menu offers arse cream (instead of ice cream) and the Bank of Ceylon's new sign reads Baynk of Ceylon, which has more than a hint of an english colonial past and plummy accents. How nayce.
The bank for toffs
Saturday, 27 March 2010
Q: When is A not A? - A: When it's an aabafili.
I received a spreadsheet from the police with a list of their offence codes in Dhivehi, for loading into the database. All of the lists are structured as a code (hidden from the user) and English and Dhivehi descriptions. The description field displayed in any drop down list depends on which language the application is set to. The user can switch betwen English and Dhivehi, although free text fields must be completed in English. We did at one stage consider allowing either language but it gets impossibly messy when the Faruma font doesn't display latin characters and Dhivehi runs right to left.
Anyway, when I started to look at this spreadsheet in some detail, I noticed a problem. The cell contents displayed as divehi but in the edit bar, i.e. what was actually stored in the cell, they were in western (latin) characters.
The contents of cell A10 are displayed as Dhivehi but input and stored as latin characters.
It's taken a few weeks to fully understand the what, why and how of this, which I think I now do. There are a number of strands to this.
Firstly is the keyboard mapping, that is the correspondance between the characters (and ASCII Codes) physical keyboard and the Dhivehi characters they produce when that language option is selected.
Next, the font we are using, P_Faruma, along with a number of other dhivehi fonts will display the unicode Thaana character set and the ascii character set as the equivalent Thaana characters. A consequence of this is that you can't dispay a latin font based language, such as english with these fonts. Some of the standard Windows TrueType fonts such as Arial, will display only the unicode character set (as far as I can see), so that ASCII 64 is always A, whereas Faruma displays ASCII 64 as the equivalent Thaana character on the dhivehi phonetic keyboard layout (selected through Regional Settings), which is ާ or Aabaafili, the double a sound and unicode character 07A7.
As I've already noted, this mapping of latin to Thaana characters isn't fixed and depends on the keyboard mapping. My first introduction to this was on a blog by Jawish Hameed, who published the java source for a transcoder he has developed to convert between latin and thaana. When I used his mappings in a VB program to do a similar job, I only got about an 80% match between the latin and Thanna in the spreadsheet data, with quite a few of the filis (vowels) being transposed. My first suspicion was that maybe this was a java thing - perhaps linux keyboards may use a different mapping. Then again it may have nothing to do with the OS but relate to mapping of the latin characters on the particular physical keyboard to the dhivehi ones. For instance my keyboard is a UK QWERTY layout, which for the main alphabet characters will be the sames as US QWERTY but would be different to say a German or French keyboard. The mapping I describe below isvalid for a US/UK QWERTY keyboard and Windows Dhivehi Phonetic (as far as I know!)
At this point I set about documenting the mapping for Dhivehi phonetic on a qwerty keyboard and once I had modified my VB code for this, I got 100% match between the latin based dhivehi displayed in the spreadsheet cells and the true unicode dhivehi. The mapping is shown below. There are a few special characters to deal with. Dhivehi reads right to left and the Dhivehi equivalents of question mark, comma and semi-colon are mirrored, as are open and closed bracket. In common with arabic, numbers run left to right whilst dates are read right to left and so formatted as yyyy mmm dd.
A big thanks must go to Jaa. The code below is mine not his but the initial mapping in his code set me on the right track.
Some other useful resources are:
http://tlt.its.psu.edu/suggestions/international/bylanguage/thaanachart.html
http://en.wikipedia.org/wiki/T%C4%81na
http://unicode.org/charts/PDF/U0780.pdf
The following is good in VB6, VB.net, Access Basic. This version handles thaana strings without any numbers, whose digits will get reversed. I have another version which does a more complete job, though one thing I have yet to crack is coding for /- used in the representation of currency values.
Function ThaanaAsciiToUnicode(ByVal strIn As String) As String
'copyright (C) 2010 Tony Bennett
'You are free to use the code for personal or commercial purposes as long as you retain the copyright notice
'The pheonetic keyboard layout mappings are:
'h' -> '1920','S' -> '1921', 'n' -> '1922', 'r' -> '1923', 'b' -> '1924', 'L' -> '1925',
'k' -> '1926', 'w' -> '1927','v' -> '1928', 'm' -> '1929',
'f' -> '1930', 'd' -> '1931', 't' -> '1932', 'l' -> '1933',
'g' -> '1934', 'N' -> '1935', 's' -> '1936', 'D' -> '1937', 'z' -> '1938', 'T' -> '1939',
'y' -> '1940', 'p' -> '1941', 'j' -> '1942', 'c' -> '1943',
'X' -> '1944', 'H' -> '1945','K' -> '1946', 'J' -> '1947',
'R' -> '1948', 'C' -> '1949', 'B' -> '1950', 'M' -> '1951',
'Y' -> '1952', 'Z' -> '1953', 'W' -> '1954', 'G' -> '1955',
'Q' -> '1956', 'V' -> '1957','a' -> '1958', 'A' -> '1959',
'i' -> '1960', 'I' -> '1961', 'u' -> '1962', 'U' -> '1963',
'e' -> '1964', 'E' -> '1965', 'o' -> '1966', 'O' -> '1967',
'q' -> '1968',
'plus the special characters
',' -> '1548', ';' -> '1563', '?' -> '1567', ')' -> '0041',
'(' -> '0040', 'Q' -> '65010'
'note that ? , ; ( ) are mirrored in thaana
Dim strOut As String
Dim c As String
Dim i As Integer
Dim j As Integer
'Dhivehi Phonetic <-> QWERTY mappings
Const AsciiChars1 = "hSnrbLkwvmfdtlgNsDzTypjcXHKJRCBMYZWGQVaAiIuUeEoOq"
'The special characters
Const AsciiChars2 = ",;?)(Q"
strOut = ""
If Len(strIn) > 0 Then
'check if any chars in unicode.
'if they are, assume this is unicode dhivehi and skip the conversion
For i = 1 To Len(strIn)
If AscW(Left(Trim(strIn), 1)) > 255 Then
AsciiToUnicode = strIn
Exit Function
End If
Next i
For i = Len(strIn) To 1 Step -1
c = Mid$(strIn, i, 1)
'need to do a case senstive instr j = InStr(1, AsciiChars1, c, vbBinaryCompare)
If j > 0 Then
strOut = strOut + ChrW(1919 + j)
Else
j = InStr(1, AsciiChars2, c, vbBinaryCompare)
If j > 0 Then
Select Case j
Case 1 ',
strOut = strOut + ChrW(1548)
Case 2 ';
strOut = strOut + ChrW(1563)
Case 3 '?
strOut = strOut + ChrW(1567)
Case 4 ')
strOut = strOut + ChrW(41)
Case 5 '(
strOut = strOut + ChrW(40)
Case 6 'Q
strOut = strOut + ChrW(65010)
End Select
Else
strOut = strOut + Mid$(strIn, i, 1)
End If
End If
Next i
End If
AsciiToUnicode = strOut
End Function
Anyway, when I started to look at this spreadsheet in some detail, I noticed a problem. The cell contents displayed as divehi but in the edit bar, i.e. what was actually stored in the cell, they were in western (latin) characters.
The contents of cell A10 are displayed as Dhivehi but input and stored as latin characters.
It's taken a few weeks to fully understand the what, why and how of this, which I think I now do. There are a number of strands to this.
Firstly is the keyboard mapping, that is the correspondance between the characters (and ASCII Codes) physical keyboard and the Dhivehi characters they produce when that language option is selected.
Next, the font we are using, P_Faruma, along with a number of other dhivehi fonts will display the unicode Thaana character set and the ascii character set as the equivalent Thaana characters. A consequence of this is that you can't dispay a latin font based language, such as english with these fonts. Some of the standard Windows TrueType fonts such as Arial, will display only the unicode character set (as far as I can see), so that ASCII 64 is always A, whereas Faruma displays ASCII 64 as the equivalent Thaana character on the dhivehi phonetic keyboard layout (selected through Regional Settings), which is ާ or Aabaafili, the double a sound and unicode character 07A7.
Dhivehi Phonetic (soft) keyboard layout
As I've already noted, this mapping of latin to Thaana characters isn't fixed and depends on the keyboard mapping. My first introduction to this was on a blog by Jawish Hameed, who published the java source for a transcoder he has developed to convert between latin and thaana. When I used his mappings in a VB program to do a similar job, I only got about an 80% match between the latin and Thanna in the spreadsheet data, with quite a few of the filis (vowels) being transposed. My first suspicion was that maybe this was a java thing - perhaps linux keyboards may use a different mapping. Then again it may have nothing to do with the OS but relate to mapping of the latin characters on the particular physical keyboard to the dhivehi ones. For instance my keyboard is a UK QWERTY layout, which for the main alphabet characters will be the sames as US QWERTY but would be different to say a German or French keyboard. The mapping I describe below isvalid for a US/UK QWERTY keyboard and Windows Dhivehi Phonetic (as far as I know!)
At this point I set about documenting the mapping for Dhivehi phonetic on a qwerty keyboard and once I had modified my VB code for this, I got 100% match between the latin based dhivehi displayed in the spreadsheet cells and the true unicode dhivehi. The mapping is shown below. There are a few special characters to deal with. Dhivehi reads right to left and the Dhivehi equivalents of question mark, comma and semi-colon are mirrored, as are open and closed bracket. In common with arabic, numbers run left to right whilst dates are read right to left and so formatted as yyyy mmm dd.
A big thanks must go to Jaa. The code below is mine not his but the initial mapping in his code set me on the right track.
Some other useful resources are:
http://tlt.its.psu.edu/suggestions/international/bylanguage/thaanachart.html
http://en.wikipedia.org/wiki/T%C4%81na
http://unicode.org/charts/PDF/U0780.pdf
The following is good in VB6, VB.net, Access Basic. This version handles thaana strings without any numbers, whose digits will get reversed. I have another version which does a more complete job, though one thing I have yet to crack is coding for /- used in the representation of currency values.
Function ThaanaAsciiToUnicode(ByVal strIn As String) As String
'copyright (C) 2010 Tony Bennett
'You are free to use the code for personal or commercial purposes as long as you retain the copyright notice
'The pheonetic keyboard layout mappings are:
'h' -> '1920','S' -> '1921', 'n' -> '1922', 'r' -> '1923', 'b' -> '1924', 'L' -> '1925',
'k' -> '1926', 'w' -> '1927','v' -> '1928', 'm' -> '1929',
'f' -> '1930', 'd' -> '1931', 't' -> '1932', 'l' -> '1933',
'g' -> '1934', 'N' -> '1935', 's' -> '1936', 'D' -> '1937', 'z' -> '1938', 'T' -> '1939',
'y' -> '1940', 'p' -> '1941', 'j' -> '1942', 'c' -> '1943',
'X' -> '1944', 'H' -> '1945','K' -> '1946', 'J' -> '1947',
'R' -> '1948', 'C' -> '1949', 'B' -> '1950', 'M' -> '1951',
'Y' -> '1952', 'Z' -> '1953', 'W' -> '1954', 'G' -> '1955',
'Q' -> '1956', 'V' -> '1957','a' -> '1958', 'A' -> '1959',
'i' -> '1960', 'I' -> '1961', 'u' -> '1962', 'U' -> '1963',
'e' -> '1964', 'E' -> '1965', 'o' -> '1966', 'O' -> '1967',
'q' -> '1968',
'plus the special characters
',' -> '1548', ';' -> '1563', '?' -> '1567', ')' -> '0041',
'(' -> '0040', 'Q' -> '65010'
'note that ? , ; ( ) are mirrored in thaana
Dim strOut As String
Dim c As String
Dim i As Integer
Dim j As Integer
'Dhivehi Phonetic <-> QWERTY mappings
Const AsciiChars1 = "hSnrbLkwvmfdtlgNsDzTypjcXHKJRCBMYZWGQVaAiIuUeEoOq"
'The special characters
Const AsciiChars2 = ",;?)(Q"
strOut = ""
If Len(strIn) > 0 Then
'check if any chars in unicode.
'if they are, assume this is unicode dhivehi and skip the conversion
For i = 1 To Len(strIn)
If AscW(Left(Trim(strIn), 1)) > 255 Then
AsciiToUnicode = strIn
Exit Function
End If
Next i
For i = Len(strIn) To 1 Step -1
c = Mid$(strIn, i, 1)
'need to do a case senstive instr j = InStr(1, AsciiChars1, c, vbBinaryCompare)
If j > 0 Then
strOut = strOut + ChrW(1919 + j)
Else
j = InStr(1, AsciiChars2, c, vbBinaryCompare)
If j > 0 Then
Select Case j
Case 1 ',
strOut = strOut + ChrW(1548)
Case 2 ';
strOut = strOut + ChrW(1563)
Case 3 '?
strOut = strOut + ChrW(1567)
Case 4 ')
strOut = strOut + ChrW(41)
Case 5 '(
strOut = strOut + ChrW(40)
Case 6 'Q
strOut = strOut + ChrW(65010)
End Select
Else
strOut = strOut + Mid$(strIn, i, 1)
End If
End If
Next i
End If
AsciiToUnicode = strOut
End Function
Labels:
ASCII,
dhivehi,
dhivehi phonetic,
Thaana,
unicode
Wednesday, 24 March 2010
SQL Collation for Dhivehi
When I originally set up the database back last September, I used the default latin collation for SQL Server, Latin1_General_CI_AI. I knew at the time I would need to consider a different collation for the dhivehi content but then forgot about this issue until the other day when I tried a search against fields holding dhivehi as unicode. My first search returned all records in the table, irrespective of the filter string. For instance searching the dhivehi firstname field for the dhivehi Bart, ބާރޓް
in a table containing records for {Bart, Homer, Lisa, Maggie, Marge} as { ބާރޓް, ހޯމަރ , މެގީ , މާރޖް , ލިސާ } using
SELECT * FROM tblPeople WHERE D_FirstName LIKE N'%ބާރޓް%'
records were returned for all the Simpson family. I was sort of expecting something like this might happen, though if I'd had to put money on it, I would have guessed I'd get no matching records.
By changing the collation on this field to one of the dhivehi set supported by SQL Server 2008, only the record for bart was returned.
The choice of collation took some consideration. The set of divehi collations in SQL Server 2008 numbers over 30 variations:
We can reduce this list by half by using the 100s in preference to the 90s. According to sql 2008 books online:
SQL Server 2008 has introduced new collations that are in full alignment with collations that Windows Server 2008 provides. These 80 new collations are denoted by *_100 version references. They provide users with the most up-to-date and linguistically accurate cultural sorting conventions.
We don't want to use the BIN collations.
Dhivehi doesn't have cases. A blog at http://www.jawish.org/blog/plugin/tag/how+to, referring to MySQL suggests a collation of utf8_unicode_ci, which isn't divehi specific but does indicate that case insensitive is good enough. Also, a blog at http://blogs.msdn.com/michkap/archive/2005/05/11/416293.aspx suggests that case sensitive can lead to problems. So case insensitive seems a good option.
The filis are stored as separate characters rather than accented versions of consonants. The choice between accents sensitive and insensitive probably makes no difference but accent insensitive seems a safe bet.
Kanatype indicates whether the data in the dataset is kanatype sensitive, which distinguishes between the two types of Japanese kana characters: Hiragana and Katakana. This is of no relevance to Dhivehi, so insensitive sounds a good option.
Width isn't an issue but insensitive seems the safest option.
This brings the choice down to
Divehi_100_CI_AI Divehi-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
To test this I created the table
CREATE TABLE [dbo].[tblPeople](
[PersonId] [int] IDENTITY(1,1) NOT NULL,
[E_Firstname] [nvarchar](40) NULL,
[E_Lastname] [nvarchar](40) NULL,
[D_FirstName] [nvarchar](40) COLLATE Divehi_100_CI_AINULL ,
[D_LastName] [nvarchar](40) COLLATE Divehi_100_CI_AI NULL)
GO
populated d_firestname with the simpsons o=in dhivehi and repeated the earlier query
SELECT * FROM tblPeople WHERE D_FirstName LIKE N'%ބާރޓް%'
which now just returns Bart's record.
Note, it is important to prefix the search string with N to indicate a unicode string.
in a table containing records for {Bart, Homer, Lisa, Maggie, Marge} as { ބާރޓް, ހޯމަރ , މެގީ , މާރޖް , ލިސާ } using
SELECT * FROM tblPeople WHERE D_FirstName LIKE N'%ބާރޓް%'
records were returned for all the Simpson family. I was sort of expecting something like this might happen, though if I'd had to put money on it, I would have guessed I'd get no matching records.
By changing the collation on this field to one of the dhivehi set supported by SQL Server 2008, only the record for bart was returned.
The choice of collation took some consideration. The set of divehi collations in SQL Server 2008 numbers over 30 variations:
We can reduce this list by half by using the 100s in preference to the 90s. According to sql 2008 books online:
SQL Server 2008 has introduced new collations that are in full alignment with collations that Windows Server 2008 provides. These 80 new collations are denoted by *_100 version references. They provide users with the most up-to-date and linguistically accurate cultural sorting conventions.
We don't want to use the BIN collations.
Dhivehi doesn't have cases. A blog at http://www.jawish.org/blog/plugin/tag/how+to, referring to MySQL suggests a collation of utf8_unicode_ci, which isn't divehi specific but does indicate that case insensitive is good enough. Also, a blog at http://blogs.msdn.com/michkap/archive/2005/05/11/416293.aspx suggests that case sensitive can lead to problems. So case insensitive seems a good option.
The filis are stored as separate characters rather than accented versions of consonants. The choice between accents sensitive and insensitive probably makes no difference but accent insensitive seems a safe bet.
Kanatype indicates whether the data in the dataset is kanatype sensitive, which distinguishes between the two types of Japanese kana characters: Hiragana and Katakana. This is of no relevance to Dhivehi, so insensitive sounds a good option.
Width isn't an issue but insensitive seems the safest option.
This brings the choice down to
Divehi_100_CI_AI Divehi-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
To test this I created the table
CREATE TABLE [dbo].[tblPeople](
[PersonId] [int] IDENTITY(1,1) NOT NULL,
[E_Firstname] [nvarchar](40) NULL,
[E_Lastname] [nvarchar](40) NULL,
[D_FirstName] [nvarchar](40) COLLATE Divehi_100_CI_AINULL ,
[D_LastName] [nvarchar](40) COLLATE Divehi_100_CI_AI NULL)
GO
populated d_firestname with the simpsons o=in dhivehi and repeated the earlier query
SELECT * FROM tblPeople WHERE D_FirstName LIKE N'%ބާރޓް%'
which now just returns Bart's record.
Note, it is important to prefix the search string with N to indicate a unicode string.
A ship the size of an island
The P &O liner Aurora arrived off Male' last night. At almost 900' long, it's bigger than the next door island of Funadhoo.
The P & O Aurora moored opposite the President's Jetty on Bodu Thakurufaanu Magu, Male'
It's not a pidgin
There's a road close to the hotel called Filigas Magu (Filigas Street). At the end of it is the only petrol (gas) station on Male'. Initially, I wondered if it was pidgin but it turns out to be just an amusing coincidence. Filigas is a kind of tree. Fili is also the name given to the 'squiggles' above and below the main characters in dhivehi, which represent the vowel sounds.
Not pidgin
Pigeon
Subscribe to:
Posts (Atom)