To be quite frank, what you need are man hours. There are many tools out there that can help you find corners or edges to start working on, but you can do the same with a coin toss. No tool will significantly reduce the number of man hours that will have to be spent fixing, refactoring and reorganizing. Take a good loooooong look, devise a simple strategy and then jump in somewhere. From personal experience, add lots of assertions as you go.
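To make the assertions advice concrete, here is a minimal sketch of what "add lots of assertions as you go" might look like while refactoring. The function `apply_discount` and its rules are hypothetical, invented purely for illustration. The idea is to write down what you believe the legacy code assumes, so a wrong belief fails loudly instead of silently.

```python
def apply_discount(price_cents, percent):
    """Hypothetical legacy helper, annotated with assertions during cleanup."""
    # Preconditions: what the existing callers appear to rely on.
    assert isinstance(price_cents, int), "price is expected in whole cents"
    assert price_cents >= 0, f"negative price: {price_cents}"
    assert 0 <= percent <= 100, f"percent out of range: {percent}"

    discounted = price_cents - (price_cents * percent) // 100

    # Postcondition: a discount never raises the price or makes it negative.
    assert 0 <= discounted <= price_cents
    return discounted


if __name__ == "__main__":
    print(apply_discount(1000, 25))
```

Each assertion that fires while you move code around tells you something about the code base that you did not know yet.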
We see new languages all the time. Most of them don't stand the test of time or are supplanted by others as time goes on -- but at this point there is a ton of industry experience in supporting a standardized Virtual Machine language / architecture / (whatever you want to call it). There's Sun/Oracle's JVM, along with several other implementations of this VM interface. The JVM will support any number of languages that target it. Microsoft has done this as well with their CLR (Common Language Runtime) as part of .NET. The web already has the "Object Model / Standard Library" end of things in the DOM. It just needs a virtual machine standard, and then web developers could bring, port or invent the language of their choice.
I will pay for a piece of Mars. Git'r Done.
I would start by finding a chair. Seriously. The only way to read code is by reading code. The more code you read the more you realize the futility of any method other than -- sitting in a chair and reading the code.
This is the biggest issue: not the direct performance hit, but the memory overhead. A pv_entry on 64-bit DragonFly is 80 bytes. 60 clients sharing a 6GB segment is not entirely unrealistic, and if each client maps the whole segment that is roughly 7GB of pv_entry overhead.
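The 7GB figure can be checked with a quick back-of-the-envelope sketch. The 80-byte pv_entry size, the 60 clients and the 6GB segment come from the comment. The 4 KiB page size and the assumption that every client maps every page of the segment are mine, so treat the result as a worst-case estimate rather than a measurement.

```python
PV_ENTRY_BYTES = 80            # from the comment: pv_entry size on 64-bit DragonFly
PAGE_BYTES = 4096              # assumption: 4 KiB base pages
SEGMENT_BYTES = 6 * 1024**3    # 6GB shared memory segment
CLIENTS = 60                   # processes attached to the segment


def pv_entry_overhead(segment_bytes, clients,
                      page_bytes=PAGE_BYTES, pv_entry_bytes=PV_ENTRY_BYTES):
    """Worst case: one pv_entry per page per client mapping the whole segment."""
    pages = segment_bytes // page_bytes
    return pages * clients * pv_entry_bytes


overhead = pv_entry_overhead(SEGMENT_BYTES, CLIENTS)
print(f"{overhead / 1024**3:.2f} GiB of pv_entry overhead")
```

Under those assumptions the overhead comes out slightly above 7 GiB, which is more than the segment itself.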
http://dl.wolfpond.org/benchs/Pg-benchmarks.2011-11.pdf
See page 3, the shm_use_phys numbers vs. the others for FreeBSD. DragonFly does not see this hit because we _excessively_ cache pv entries (it would be nice if we could dial this back).
The performance difference here is identical to what this patch will cause.
I just posted this to the blog, but I will repeat it here --
There is a very good reason we OS vendors do not ship with SysV default limits high enough to run a serious PostgreSQL database. There is very little software that uses SysV in any serious way other than PostgreSQL, and there is a fixed overhead to increasing those limits. You end up wasting RAM for all the users who do not need the limits to be that high. That said, you are late to the party here. Vendors have finally decided that the fixed overheads are low enough relative to modern RAM sizes that the defaults can be raised quite high. DragonFly BSD has shipped with greatly increased limits for a year or so, and I believe FreeBSD has as well.
There is a serious problem with this patch on BSD kernels. All of the BSD SysV implementations have a shm_use_phys optimization which forces the kernel to wire the memory pages used to back SysV segments. This increases performance by not requiring the allocation of pv entries for these pages, and it also reduces memory pressure. Most serious users of PostgreSQL on BSD platforms use this well-documented optimization. After switching to 9.3, large and well-optimized Pg installations that previously ran well in memory will be forced into swap because of the pv entry overhead.
This would be insightful and all -- except that it isn't -- because DragonFly BSD uses the same x86-64 calling conventions as Linux.
Because when the leadership of an open source project doesn't do what you want, nobody is stopping you from doing it yourself. Someone fork it, I'll use it.