A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | AB | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | >> Pune Budget Book 2015-16 << | GENERAL COMMENTS | !! scroll down this page to learn how to do pdf to excel conversion!! | |||||||||||||||||||||||||
2 | ||||||||||||||||||||||||||||
3 | Welcome. This is an excel-ized version of the Pune Budget Book. | Download the original PDF of 2015-16 Pune Budget, Standing Committee (final) from punecorporation.org | ||||||||||||||||||||||||||
4 | We hope Budget Analysts, Policy Advocates, Activists, and Citizens' groups in Pune can use this as a tool in their work. | Going Further: | ||||||||||||||||||||||||||
5 | Click File > Download As in the menu just below the title to download your own copy for offline use. | This can be done for other cities' budget books too. It won't cost any money, but you'll have to put in your own time. Get in touch. | ||||||||||||||||||||||||||
6 | We also invite you to join us in collaboratively commenting in this document. If you're interested, please contact the author. (You can also make a copy on google-drive and share it with your own team for collaborative analysis.) | We're looking for IT students / professionals to put this data in a database. Get in touch if you're interested! | ||||||||||||||||||||||||||
7 | ||||||||||||||||||||||||||||
8 | ||||||||||||||||||||||||||||
9 | ||||||||||||||||||||||||||||
10 | INSTRUCTIONS: | LINKS | ||||||||||||||||||||||||||
11 | 1. Please start with the INDEX sheet. It is a detailed listing of all the sections of the Pune Municipal Corporation's Budget Book. It also has the corresponding page numbers listed. | Janwani's page on PMC Budget simplification and Analysis (includes link to presentation given on 18 April 2015) | ||||||||||||||||||||||||||
12 | 2. Those sections have been serialized and are laid out as separate worksheets in this file. | Participatory Budget works for 2015-16 : DB-like listing for analysis, basic statistics | ||||||||||||||||||||||||||
13 | 3. The Glossary sheet is an exhaustive listing of all the budget codes and their corresponding departments / categories. | Ward-wise extracts of budget works | ||||||||||||||||||||||||||
14 | 4. A good place to start is the sheet A04. It is the Revenue Expenditures listing | |||||||||||||||||||||||||||
15 | ||||||||||||||||||||||||||||
16 | META: | |||||||||||||||||||||||||||
17 | Author : Nikhil VJ, in association with CEE (Center for Environment Education, Pune) Participatory Urban Governance program. | |||||||||||||||||||||||||||
18 | Email: nikhil.js [at] gmail.com Mobile: +91-9665831250 Blog: http://nikhilsheth.blogspot.in | |||||||||||||||||||||||||||
19 | Published on 18 April 2015, under Creative Commons ShareAlike license, CC BY-SA (ie, Free to copy, share, use) (see https://creativecommons.org/licenses/by-sa/4.0/) | |||||||||||||||||||||||||||
20 | Send an email to nikhil.js [at] gmail.com if you're interested in collaboratively flagging, commenting here. | |||||||||||||||||||||||||||
21 | ||||||||||||||||||||||||||||
22 | ||||||||||||||||||||||||||||
23 | DISCLAIMER | TROUBLESHOOTING | ||||||||||||||||||||||||||
24 | 1 | We have tried our best to be accurate, especially to maintain row integrity (ie, one entry shouldn't get another's figures), but cannot give a full guarantee that there won't be any errors here. Kindly cross-check the section you're working on with the original budget document, page numbers are given for each. | In case you are having trouble seeing this online, try downloading an offline copy from File > Download As (using the menu in google docs, not in your browser) | |||||||||||||||||||||||||
25 | 2 | In case you spot any errors in this document, please let us know immediately and give exact location. | See an HTML-published version of this document at the link below: | |||||||||||||||||||||||||
26 | 3 | Small corrections, column additions etc may occur after the publish date. | Click here to see an HTML-published version of this document, should load better on older machines or mobiles | |||||||||||||||||||||||||
27 | 4 | You might see a column somewhere called "auto-translate". It's an experiment using the Google Translate function. It is automated and very inaccurate, so please don't rely on it. If you're interested in helping to translate this document to English, kindly contact the author. | There is a link to the original PDF data given at the top right corner of each section so that the user can compare and check for errors | |||||||||||||||||||||||||
28 | ||||||||||||||||||||||||||||
29 | ||||||||||||||||||||||||||||
30 | Quick Links: | |||||||||||||||||||||||||||
31 | Link to this document: http://tiny.cc/punebudget2015 for easy sharing | |||||||||||||||||||||||||||
32 | Get PDFs of each section, extracted from original budget book, here | |||||||||||||||||||||||||||
33 | Font converter : budget book to Unicode | |||||||||||||||||||||||||||
34 | ||||||||||||||||||||||||||||
35 | ||||||||||||||||||||||||||||
36 | HOW THIS WAS DONE | |||||||||||||||||||||||||||
37 | 1 | The author was involved in a sectoral budget analysis project last year and had familiarized himself with the structure of the 2014-15 budget book, and made an improved contents page for the same for the team's use. | Most of the work in putting this document together was done by a crack team of highly trained monkeys, namely: | |||||||||||||||||||||||||
38 | 2 | It was felt that to even start with proper analysis, it was important to get all the data into a format like excel from where it can be sorted, filtered, totaled, etc. | 1 | PDFtk4all : to break the 623-pg pdf into its sections | ||||||||||||||||||||||||
39 | 3 | Plan A was to try and get a copy of the budget book in the original excel formats from which the PDF is created. | 2 | pdftoexcelonline.com : online PDF to Excel converter | ||||||||||||||||||||||||
40 | 4 | Failing that, this was plan B. | 3 | zamzar.com : online PDF to Excel converter | ||||||||||||||||||||||||
41 | 5 | A week before the budget book usually gets released, the author created blank excel templates for each of the sections. | 4 | Tabula : for precision table extracting | ||||||||||||||||||||||||
42 | 6 | Once the PDF was published on punecorporation.org website, it was downloaded, split off into its sections, and progressively each section was entered into its templates | 5 | AutoHotKey : Automation tool | ||||||||||||||||||||||||
43 | 7 | A key enabler here was the font-converter which took any text copied from the budget book and faithfully converted it to Unicode Marathi, which is what you'll see in this file. So now you can copy-paste stuff out easily. | 6 | Notepad++ : ninja text editing | ||||||||||||||||||||||||
44 | 8 | The converter was made some months prior with the help of Mr. Anunad Singh, who is part of "Scientific and Technical Hindi", a nationwide group of enthusiasts who have been creating scripts to convert legacy Devanagari fonts into Unicode. https://sites.google.com/site/technicalhindi/home/converters | 7 | Some custom-made scripts to normalize text | ||||||||||||||||||||||||
45 | 9 | A host of different applications were used to get the data out from the pdf while preserving the tabular format as much as possible. There were different solutions for different situations encountered. | 8 | A converter for getting the text in Unicode Marathi | ||||||||||||||||||||||||
46 | 10 | Once each section was populated properly, the worksheets from all the files were pulled into this one document. | 9 | And a special vote of thanks to box/column select in PDF (Ctrl+Alt+click+drag), and the ever-faithful Ctrl+C & Ctrl+V | ||||||||||||||||||||||||
47 | 11 | There is a link to the original PDF data given at the top right corner of each section so that the user can compare and check for errors | ||||||||||||||||||||||||||
48 | ||||||||||||||||||||||||||||
49 | ||||||||||||||||||||||||||||
50 | DIY PDF to TABLE conversion | |||||||||||||||||||||||||||
51 | 1 | If you want to do multiple pages at a go, then you'll first need to extract just those pages from the bigger document. For that, you can use: | ||||||||||||||||||||||||||
52 | http://www.splitpdf.com/ | |||||||||||||||||||||||||||
53 | ||||||||||||||||||||||||||||
54 | 2 | Then, this site will convert that pdf to excel for you. It was pretty accurate for me: | ||||||||||||||||||||||||||
55 | http://pdftoexcelonline.com/ | |||||||||||||||||||||||||||
56 | If not that, then my second choice is: | |||||||||||||||||||||||||||
57 | http://www.zamzar.com | |||||||||||||||||||||||||||
58 | You might have to clean up the converted excel, re-align stuff, etc. | |||||||||||||||||||||||||||
59 | ||||||||||||||||||||||||||||
60 | 3 | In case it's just a few pages, or if you need a specific part of the page only, then better to use Tabula: | ||||||||||||||||||||||||||
61 | http://tabula.technology | |||||||||||||||||||||||||||
62 | Tabula is portable software that runs on your computer and works through your browser. You visually draw a box on the page and it converts that area to a table. Either copy the result to the clipboard and paste it into excel using the import text wizard, or save it as CSV and load that in excel. | |||||||||||||||||||||||||||
63 | ||||||||||||||||||||||||||||
64 | ------------- | |||||||||||||||||||||||||||
65 | ||||||||||||||||||||||||||||
66 | 4 | Now, the converted file will have weird characters. Select all the cells, copy, and paste it here: | ||||||||||||||||||||||||||
67 | http://nikhilsheth.techydudes.net/files/PuneBudget_to_Unicode_Converter.html | |||||||||||||||||||||||||||
68 | Convert, copy, and then in excel select just the topmost cell of your selection and press Ctrl+V. All the cells you originally copied from should be replaced. | |||||||||||||||||||||||||||
69 | ||||||||||||||||||||||||||||
70 | In case this was in ShreeDev font, try this: | |||||||||||||||||||||||||||
71 | http://nikhilsheth.techydudes.net/files/PMPML-Shreedev-to-Unicode-Converter.html | |||||||||||||||||||||||||||
72 | ||||||||||||||||||||||||||||
73 | And if none of them work, then send me the file. I'll try to trace which converter it needs. | |||||||||||||||||||||||||||
74 | ||||||||||||||||||||||||||||
75 | 4.1 | The text might get garbled in the pdf-to-excel conversion. If that happens (it happened a lot here), then you have to copy-paste the text from the pdf. Press and hold the Ctrl and Alt keys and draw a box (box-select) in the pdf-reader. That should let you select all the text in a column. First paste that into a simple text editor like Notepad. | ||||||||||||||||||||||||||
76 | 4.2 | Where there was more than one line of text in a cell, you'll have to "unwrap" them back to one line (switch off wrapping via Format > Word Wrap). Where there were blank cells, you'll have to press Enter twice to space out the lines. Once you're sure it's exactly as it was in the PDF, you can safely paste it back into the excel. Use the adjoining cells to make sure everything is well aligned. | ||||||||||||||||||||||||||
77 | ||||||||||||||||||||||||||||
78 | 5 | For the figures, if they have commas like this: 52,50,000 then they'll be stored in the excel as text, not as numbers. To convert them to numbers, select all the cells containing the figures, copy, and paste them here: | ||||||||||||||||||||||||||
79 | http://nikhilsheth.techydudes.net/files/zapper.html | |||||||||||||||||||||||||||
80 | ||||||||||||||||||||||||||||
81 | Then press "zap commas". Copy the output and paste it back into the excel, with the first cell of your copied block selected. | |||||||||||||||||||||||||||
82 | ||||||||||||||||||||||||||||
100 |
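
The comma-removal in step 5 of the DIY section can also be done offline. The following is a minimal Python sketch of the idea, not the code behind the "zap commas" page: the function names `zap_commas` and `zap_block` are hypothetical, and it assumes the cells were copied from the spreadsheet as tab-separated text with one row per line.

```python
import re

# A number written with comma grouping, Indian (52,50,000) or Western (5,250,000),
# with an optional minus sign and optional decimals. This pattern is an assumption
# about what the budget figures look like.
_GROUPED_NUMBER = re.compile(r"^-?\d{1,3}(,\d{2,3})*(\.\d+)?$")


def zap_commas(cell: str) -> str:
    """Return the cell without commas if it looks like a grouped number.

    Anything else (for example text that happens to contain a comma)
    is returned unchanged.
    """
    text = cell.strip()
    if "," in text and _GROUPED_NUMBER.match(text):
        return text.replace(",", "")
    return cell


def zap_block(block: str) -> str:
    """Apply zap_commas to every cell of a tab-separated block.

    Rows and columns, including blank cells, stay where they were, so the
    result can be pasted back over the first cell of the copied block.
    """
    rows = block.split("\n")
    return "\n".join(
        "\t".join(zap_commas(cell) for cell in row.split("\t"))
        for row in rows
    )


if __name__ == "__main__":
    copied = "52,50,000\t1,200\n\tRoads, bridges"
    print(zap_block(copied))
```

Cells that are not plain grouped numbers are left alone on purpose, so descriptive text containing commas passes through untouched and the pasted block keeps its original alignment.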