>> Pune Budget Book 2015-16 <<
GENERAL COMMENTS
!! Scroll down this page to learn how to do PDF to Excel conversion !!

Welcome. This is an excel-ized version of the Pune Budget Book.
We hope Budget Analysts, Policy Advocates, Activists, and Citizens' groups in Pune can use this as a tool in their work.
Click File > Download As in the menu just below the title to download your own copy for offline use.
We also invite you to join us in collaboratively commenting in this document. If you're interested, please contact the author. (You can also make a copy on Google Drive and share it with your own team for collaborative analysis.)

Download the original PDF of the 2015-16 Pune Budget, Standing Committee (final), from punecorporation.org

Going Further:
This can be done for other cities' budget books as well. It won't cost any money, but you'll have to put in your own time. Get in touch.
We're looking for IT students / professionals to put this data in a database. Get in touch if you're interested!

INSTRUCTIONS:
1. Please start with the INDEX sheet. It is a detailed listing of all the sections of the Pune Municipal Corporation's Budget Book, with the corresponding page numbers.
2. Those sections have been serialized and are laid out as separate worksheets in this file.
3. The Glossary sheet is an exhaustive listing of all the budget codes and their corresponding departments / categories.
4. A good place to start is sheet A04, the Revenue Expenditures listing.

LINKS
Janwani's page on PMC Budget simplification and analysis (includes link to presentation given on 18 April 2015)
Participatory Budget works for 2015-16: DB-like listing for analysis, basic statistics
Wardwise extracts of budget works

META:
Author: Nikhil VJ, in association with CEE (Center for Environment Education, Pune) Participatory Urban Governance program.
Email: nikhil.js [at] gmail.com
Mobile: +91-9665831250
Blog: http://nikhilsheth.blogspot.in
Published on 18 April 2015 under the Creative Commons ShareAlike license, CC BY-SA (i.e., free to copy, share, use); see https://creativecommons.org/licenses/by-sa/4.0/
Send an email to nikhil.js [at] gmail.com if you're interested in collaboratively flagging or commenting here.

DISCLAIMER
1. We have tried our best to be accurate, especially to maintain row integrity (i.e., one entry shouldn't get another's figures), but cannot fully guarantee that there are no errors here. Kindly cross-check the section you're working on with the original budget document; page numbers are given for each.
2. In case you spot any errors in this document, please let us know immediately and give the exact location.
3. Small corrections, column additions, etc. may occur after the publish date.
4. You might see a column somewhere called "auto-translate". It's an experiment using the Google Translate function. It is automated and very inaccurate, so please don't bother with that. If you're interested in helping to translate this document to English, kindly contact the author.

TROUBLESHOOTING
In case you are having trouble seeing this online, try downloading an offline copy from File > Download As (using the menu in Google Docs, not in your browser).
See an HTML-published version of this document at the link below:
Click here to see an HTML-published version of this document; it should load better on older machines or mobiles.
There is a link to the original PDF data at the top right corner of each section so that the user can compare and check for errors.

Quick Links:
Link to this document, for easy sharing: http://tiny.cc/punebudget2015
Get PDFs of each section, extracted from the original budget book, here
Font converter: budget book to Unicode

HOW THIS WAS DONE
1. The author was involved in a sectoral budget analysis project last year, had familiarized himself with the structure of the 2014-15 budget book, and made an improved contents page of it for the team's use.
2. It was felt that, to even start proper analysis, it was important to get all the data into a format like Excel, where it can be sorted, filtered, totaled, etc.
3. Plan A was to try and get a copy of the budget book in the original Excel formats from which the PDF is created.
4. Failing that, this was plan B.
5. A week before the budget book usually gets released, the author created blank Excel templates for each of the sections.
6. Once the PDF was published on the punecorporation.org website, it was downloaded and split into its sections, and each section was progressively entered into its template.
7. A key enabler here was the font converter, which took any text copied from the budget book and faithfully converted it to Unicode Marathi, which is what you'll see in this file. So now you can copy-paste stuff out easily.
8. The converter was made some months prior with the help of Mr. Anunad Singh, who is part of "Scientific and Technical Hindi", a nationwide group of enthusiasts who have been creating scripts to convert legacy Devanagari fonts into Unicode. https://sites.google.com/site/technicalhindi/home/converters
9. A host of different applications were used to get the data out of the PDF while preserving the tabular format as much as possible. There were different solutions for the different situations encountered.
10. Once each section was populated properly, the worksheets from all the files were pulled into this one document.
11. There is a link to the original PDF data at the top right corner of each section so that the user can compare and check for errors.

Most of the work in putting this document together was done by a crack team of highly trained monkeys, namely:
1. PDFtk4all: to break the 623-pg PDF into its sections
2. pdftoexcelonline.com: online PDF to Excel converter
3. zamzar.com: online PDF to Excel converter
4. Tabula: for precision table extracting
5. AutoHotKey: automation tool
6. Notepad++: ninja text editing
7. Some custom-made scripts to normalize text
8. A converter for getting the text in Unicode Marathi
9. And a special vote of thanks to box/column select in PDF (Ctrl+Alt+click+drag), and the ever-faithful Ctrl+C & Ctrl+V

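Breaking the 623-page PDF into its sections needs a first and last page for every section, and those can be worked out from the start pages listed in the INDEX sheet. The following is only a rough Python sketch of that bookkeeping, not the procedure actually used here. It assumes every section starts on a fresh page, and the section names and page numbers in the example are made up for illustration.

```python
def section_page_ranges(start_pages, total_pages):
    """Turn {section: first_page} into {section: (first_page, last_page)}.

    Pages are 1-based and inclusive. A section is assumed to run until the
    page before the next section starts; the last one runs to total_pages.
    """
    ordered = sorted(start_pages.items(), key=lambda item: item[1])
    ranges = {}
    for i, (name, first) in enumerate(ordered):
        if i + 1 < len(ordered):
            last = ordered[i + 1][1] - 1
        else:
            last = total_pages
        if last < first:
            raise ValueError(f"section {name!r} has an empty page range")
        ranges[name] = (first, last)
    return ranges


# Hypothetical section names and start pages, for illustration only.
example_index = {"A01": 1, "A02": 15, "A03": 40}
print(section_page_ranges(example_index, total_pages=623))
```

The resulting ranges can then be typed into whichever PDF splitting tool you prefer.
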
DIY PDF to TABLE conversion
1. If you want to do multiple pages at a go, you'll first need to extract just those pages from the bigger document. For that, you can use:
http://www.splitpdf.com/

2. Then, this site will convert that PDF to Excel for you. It was pretty accurate for me:
http://pdftoexcelonline.com/
If not that, then my second choice is:
http://www.zamzar.com
You might have to clean up the converted Excel file, re-align stuff, etc.

3. In case it's just a few pages, or if you need only a specific part of the page, it's better to use Tabula:
http://tabula.technology
It's portable software that runs on your computer and works through your browser. You visually draw a box on the page and it converts that to a table. Either copy it to the clipboard and paste it into Excel using the Import Text Wizard, or save it as CSV and then load that in Excel.
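
Before loading a saved CSV into Excel, a quick check can point out rows whose number of cells differs from the rest, which is usually where the re-aligning is needed. This is a minimal sketch using only Python's standard `csv` module. The function name and the sample data are made up for illustration, and it treats the most common row width as the "right" one, which is an assumption that may not hold for every table.

```python
import csv
import io
from collections import Counter


def find_misaligned_rows(csv_text):
    """Return (expected_width, [(row_number, width), ...]) for odd rows.

    expected_width is the most common number of cells per row; rows with
    any other width are reported with their 1-based row number.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return 0, []
    widths = [len(row) for row in rows]
    expected = Counter(widths).most_common(1)[0][0]
    odd = [(n, w) for n, w in enumerate(widths, start=1) if w != expected]
    return expected, odd


# Made-up sample standing in for a CSV saved from a table extractor.
sample = (
    "Code,Description,Amount\n"
    "X1,Road repairs,\"52,50,000\"\n"
    "X2,Street lights\n"
    "X3,Gardens,\"1,20,000\"\n"
)
print(find_misaligned_rows(sample))
```

For a real file, read its contents first (for example with `open(path, newline="", encoding="utf-8").read()`) and pass the text in.
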

-------------

4. Now, the converted file will have weird characters, because the budget book uses a legacy (non-Unicode) font. Select all the cells, copy, and paste them here:
http://nikhilsheth.techydudes.net/files/PuneBudget_to_Unicode_Converter.html
Convert, copy, and in Excel select just the topmost cell of your selection and press Ctrl+V. All the cells you originally copied from should be replaced.

In case this was in the ShreeDev font, try this:
http://nikhilsheth.techydudes.net/files/PMPML-Shreedev-to-Unicode-Converter.html

And if neither of them works, send me the file. I'll try to trace which converter it needs.

4.1 The text might get garbled in the PDF-to-Excel conversion. If that happens (it happened a lot here), you have to copy-paste the text from the PDF. Press and hold the Ctrl and Alt keys and draw a box (box-select) in the PDF reader; that should let you select all the text in a column. First paste that into a simple text editor like Notepad.
4.2 Where there was more than one line of text in a cell, you'll have to "unwrap" them back to one line (switch off wrapping via Format > Word Wrap). Where there were blank cells, you'll have to press Enter, Enter to space out the lines. Once you're sure it's exactly as it was in the PDF, you can safely paste it back into Excel. Use the adjoining cells to make sure everything is well aligned.
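
The copy, convert, paste round trip in step 4 works because cells copied from a spreadsheet land on the clipboard as plain text, with tabs between columns and line breaks between rows. Below is a minimal Python sketch of converting such a block cell by cell while keeping that layout intact. It is not the code of the converters linked above: `convert_block` is a hypothetical helper, and `str.upper` is only a stand-in for a real legacy-font-to-Unicode mapping. It also assumes no cell contains a tab or a line break of its own, which is what the unwrapping in 4.2 takes care of.

```python
def convert_block(block, convert_cell):
    """Apply convert_cell to every cell of a tab-separated block.

    Rows are separated by newlines and cells by tabs, as in text copied
    from a spreadsheet. Empty cells are kept so rows stay aligned.
    Assumes no cell contains a tab or a line break of its own.
    """
    # Drop only the single trailing newline that copying usually adds.
    if block.endswith("\n"):
        block = block[:-1]
    converted = []
    for row in block.split("\n"):
        cells = row.rstrip("\r").split("\t")  # tolerate Windows line ends
        converted.append("\t".join(convert_cell(cell) for cell in cells))
    return "\n".join(converted)


# str.upper is only a stand-in for a real font conversion function.
copied = "abc\t\tdef\nghi\tjkl\t\n"
print(repr(convert_block(copied, str.upper)))
```

Because empty cells survive as empty strings, pasting the result back with the top-left cell selected fills the same rectangle the text was copied from.
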

5. For the figures, if they have commas like this: 52,50,000, they'll be stored in Excel as text, not as numbers. To turn them into numbers, select all the cells containing them, copy, and paste here:
http://nikhilsheth.techydudes.net/files/zapper.html
...and press "zap commas". Copy the output and paste it back into Excel, with the first cell of your copied block selected.
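
If you'd rather do the comma removal in a script, the idea fits in a few lines of Python. This is a sketch of the same idea, not the code behind the zapper page, and `zap_commas` is a name chosen here for illustration. It only handles whole, non-negative figures; anything else (labels, blanks, decimals, negative amounts) is passed through untouched so that nothing gets silently mangled.

```python
def zap_commas(cell):
    """Return an int for digit groups joined by commas, else the cell as is.

    Handles Indian-style grouping such as "52,50,000". Anything that is
    not purely decimal digits after removing commas is returned unchanged.
    """
    stripped = cell.strip().replace(",", "")
    if stripped.isdecimal():
        return int(stripped)
    return cell


for text in ["52,50,000", "1,20,000", "Total", ""]:
    print(repr(zap_commas(text)))
```

Since the grouping commas are simply dropped, the same function works for both the Indian (52,50,000) and the Western (5,250,000) style.
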