spitzak · Slashdot Mirror

Re:Subpixel and anaglyphs; distance fields on Glyphy: High Quality Glyph Rendering Using OpenGL ES2 Shaders · 2014-01-15 11:15 · Score: 1

Yes, this is an improvement on signed distance fields. If I understand it right, it is not the distance to the nearest point, but a definition of the nearest circular arc that is stored in each texture pixel. This seems to preserve corners and thin stems. Though it sounds complex, he in fact has to store more than one arc per pixel (as the closest one varies depending on the position) and it looks like it has to define actual arcs, not circles, which I would imagine complicates the shader greatly.

Re:Video of SDF rendering on Glyphy: High Quality Glyph Rendering Using OpenGL ES2 Shaders · 2014-01-15 11:08 · Score: 2

That video shows the first type of signed distance implementation, while glyphy is a new type of storage.

He shows a texture for this older version quickly at the start of his video. In that version the distance from the center of the pixel to the nearest edge is stored in each pixel (one number per pixel). This has been done for years, btw, and is not new.
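A minimal sketch of that older one-number-per-pixel scheme, in Python rather than a shader. The circle shape, the 8x8 size and the `coverage` smoothing width are assumptions made for illustration; they are not taken from the video or from glyphy.

```python
import math

def circle_sdf_texture(size, cx, cy, radius):
    """Build a size x size grid. Each cell stores one number: the signed
    distance from the pixel center to the circle's edge (negative inside)."""
    texture = []
    for y in range(size):
        row = []
        for x in range(size):
            px, py = x + 0.5, y + 0.5  # pixel center
            row.append(math.hypot(px - cx, py - cy) - radius)
        texture.append(row)
    return texture

def coverage(distance, smoothing=0.5):
    """Turn a signed distance into an alpha value in [0, 1].
    The edge (distance 0) maps to 0.5; 'smoothing' is an assumed width."""
    t = 0.5 - distance / (2.0 * smoothing)
    return max(0.0, min(1.0, t))

tex = circle_sdf_texture(8, 4.0, 4.0, 3.0)
alpha = [[coverage(d) for d in row] for row in tex]
```

Because only a single distance survives per pixel, two edges meeting inside one pixel cannot both be represented, which is where the rounded corners come from.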

In glyphy the actual definition of the nearest circular arc is stored in each pixel: either 3 or 5 numbers per pixel, depending on whether a full circle or an arc segment is used (that is unclear from the presentation). It also stores more than one circular arc per pixel, as the closest one differs for each point in the pixel; how that is done is also pretty unclear. So the texture is bigger (maybe; perhaps it could be lower resolution for equal quality). But it gets rid of a lot of artifacts seen in that video, in particular sharp corners are preserved.

A problem is that his shader is too complex and hits bugs in all the current GLSL compilers.

Re:Bike helmet? on Building a Better Bike Helmet Out of Paper · 2014-01-13 12:41 · Score: 1

I was required to wear a helmet while taking a track driving course, but was driving my own car and it did not have any of these other safety features. So somebody felt the helmet alone was worth it even without the 6-point harness, etc.

Re:9.1 on Windows 9 Already? Apparently, Yes. · 2014-01-13 07:39 · Score: 2

Why is the response to swipes and hot corners part of the *driver* rather than the system? I really don't like the sound of that, as I have certainly had bad experiences with vendor-supplied drivers (especially for printers) and was under the impression Microsoft was trying to stamp out the worst offenders.

Re:Because it works. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-11 13:52 · Score: 1

The equality test was probably a poor example. You are right that Python will return false if the types are different.

Passing a string to a function taking Unicode in Python2.x is the problem: it will throw exceptions unexpectedly on bad UTF-8. It does sound like Python3.x throws an exception *always* which I suppose is better, because my code will not work unless I call my translate-without-an-exception function. Still really annoying.

You can indeed search for 'A' without worrying about the encoding. It is the value 65 in every encoding that anybody will ever encounter on a modern system. That is why I want it to work without having to think about encodings.

And I'm sorry, pretending that certain byte arrangements are "not UTF-8" and therefore will just magically not appear because the string "is UTF-8" is wishful thinking and showing a complete ignorance of physical reality. People store NaN in floating point and don't say "that won't happen". They should learn to do this with text.

Re:python sucks on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 14:50 · Score: 1

you can still do something like 'fooæbar\x88\n'.encode('utf-8').

That does not work as it will turn the \x88 into a two-byte encoding of U+0088.
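This can be checked in Python 3. In a str literal '\x88' means the code point U+0088, so encoding it produces the two bytes C2 88 instead of the single raw byte. The concatenation at the end is just one possible workaround, not something from the parent post.

```python
text = 'fooæbar\x88\n'
encoded = text.encode('utf-8')

# What I actually want: UTF-8 for the text, plus one raw 0x88 byte.
wanted = 'fooæbar'.encode('utf-8') + b'\x88\n'

print(encoded)  # the \x88 has become \xc2\x88
print(wanted)   # the \x88 is a single byte, so this is not valid UTF-8
```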

I understand that bytes.find() takes bytes and that bytes.find('foo') is an error. However having to add/remove the 'b' as code is reused is rather painful, and makes duck typing a lot more unpredictable. In addition it sounds like I cannot put unicode (such as '\u1234') into a bytes constant, making portability even more annoying as I have to hand-encode it into UTF-8 (having had to do this for Windows C source code I can tell you that is very error-prone).

Although based on Python2.x, our C++ code with a Python api does exactly what you propose. All string arguments are examined and if they are what Python calls "unicode" rather than string/bytes then our own converter to UTF-8 is run. All strings returned to Python are using the "bytes" type (which is the same as str in 2.x). The main problem is that this return type will fail in Python3.0 as it is not automatically converted to Unicode, and we cannot convert to Unicode as we need to be able to return invalid UTF-8.

Re:Because it works. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 14:42 · Score: 1

My actual desire is to be able to use UTF-8, yet retain the ability to name a file that has an invalid UTF-8 name. Currently the Python3 libraries do not provide a function that does this. This is incredibly inconvenient, and I am pretty much forced to do what everybody does, which is abandon use of UTF-8 and Unicode, using raw bytes for everything.

I don't think b'...' works correctly for non-ASCII characters. I want the byte sequence for UTF-8. In particular I want it if I type in '\u1234' (note that '\x89' should still produce the byte for compatibility and to retain the ability to make invalid UTF-8, if I want the unicode character U+89 then I have to put in '\u0089'). In addition it would help considerably that if the source code is UTF-8 then b'...' will copy the bytes between the quotes exactly, whether or not they are valid UTF-8. This will allow code written using legacy encodings to continue to work.

BTW Python3.3 and up have abandoned the 2.x "unicode" object entirely. Instead they are using a union of ASCII-or-UTF-8, UCS-2, and UTF-32, transparently converting between these depending on what requests were done and what the ranges of characters are. This is getting a lot closer to what is needed but I would like the following added:

1. VITAL: add the ability to store invalid UTF-8. It does not throw an exception unless an attempt is made to actually access unicode "characters". In particular it does not throw an exception if written to a byte stream.

2. Add ability to index the UTF-8 code points, thus allowing looking at code points without throwing exceptions

2a. Probably add the ability to index UTF-16 by reusing the UCS-2 for this just like it uses the ASCII for UTF-8. I am suggesting this because I suspect people working with UTF-16, such as the hordes of Windows programmers will have the same problems I am having with UTF-8.

3. String constants are stored in the UTF-8 form if they contain UTF-8 encoding errors. \xNN and \OOO escapes can be used to write UTF-8 encoding errors in source code that is itself not in UTF-8. (UTF-16 errors are a bit easier as \uNNNN can be used for these without conflict).

In addition a number of libraries need a bit of patching so they don't throw exceptions. For instance on Windows asking if a file exists where the filename is a bad UTF-8 string should return false, not throw an exception. An easy way is to make conversion never throw an exception by converting the bad bytes to 0xDCxx (which is what Python 3 does with command line arguments) but that could be dangerous.
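The 0xDCxx conversion is available in Python 3 as the 'surrogateescape' error handler. A small sketch of how it behaves; the file name is made up for illustration.

```python
# Valid UTF-8 for "café-" followed by one stray 0x88 byte.
raw = b'caf\xc3\xa9-\x88.txt'

# Decoding never throws: the bad byte becomes the lone surrogate U+DC88.
name = raw.decode('utf-8', errors='surrogateescape')

# Encoding with the same handler restores the original bytes exactly.
round_trip = name.encode('utf-8', errors='surrogateescape')

# A strict encode of the same str still throws, so the problem is only
# moved to whatever code later writes the string out.
try:
    name.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The last step is the dangerous part: the resulting str looks normal until something tries to encode it strictly.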

Re:python sucks on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 13:40 · Score: 1

That syntax for the quoted string does not work for UTF-8, and requiring the b for the find argument is broken because I cannot reuse code.

Here is what I want (though I would prefer that 'b' be automatic and 'u' mean you want a "unicode" thing):

s = b'fooæbar\x88\n'

s now contains the UTF-8 encoding of æ and a newline at the end and also an invalid bit of UTF-8 in a byte with the value 0x88.

s.find('a')
s.find('\x88')

These work to find the byte offset of the given match

s.find('æ')

This is a search for a 2-byte string containing the UTF-8 of æ.
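For comparison, roughly the same behavior can be had in Python 3 today with a small wrapper. The `find` helper below is hypothetical (it is not part of any library); it encodes str needles to UTF-8 before searching, and a raw byte still has to be passed as bytes.

```python
def find(haystack: bytes, needle) -> int:
    """Hypothetical helper: search bytes for a str or bytes needle.
    A str needle is encoded to UTF-8 first; returns a byte offset or -1."""
    if isinstance(needle, str):
        needle = needle.encode('utf-8')
    return haystack.find(needle)

# UTF-8 text plus one invalid byte, built without decoding anything.
s = 'fooæbar'.encode('utf-8') + b'\x88\n'

offset_a = find(s, 'a')         # plain ASCII search
offset_bad = find(s, b'\x88')   # the raw byte must be given as bytes
offset_ae = find(s, 'æ')        # searches for the 2-byte UTF-8 sequence
```

Note that find(s, '\x88') with a str argument looks for the encoding of U+0088 (C2 88) and does not match the raw byte.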

Yes I meant Guido, not Theo. Sorry.

Re:Because it works. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 13:35 · Score: 1

Good points about the print syntax, though I think handling space the way I said would get a large number of these. Maybe it could be done for the command prompt only, which would help.

The reason 'foo == "constant"' can throw an exception is that it will try to convert foo to "unicode" and can throw an exception on any encoding errors. I don't want that, I want it to return false. Note that Python2.x does this as well if you put 'u' in front of the constant, and it considers every byte with the high bit set an error, so it is perhaps worse...

As you point out, "bytes" do not have isalpha() or a lot of other stuff. Therefore they are useless. I cannot do even the most trivial text operations on arbitrary data that contains "mostly" text (ie perhaps a UTF-8 string that got one byte of garbage added to the end...).

I should be able to search a 'bytes' for 'A' without thinking about encodings. I should be able to do at least isspace(). isalnum() might be nice as well. All of this is incredibly difficult, or you will write programs that throw unexpected exceptions and thus you have made a nice denial of service bug.

They can fix this. Add some isalpha()/etc stuff to "bytes" so that I can use them (just assume it is ASCII and for bytes with the high bit set all these tests return the same value, for instance true for isprint()). Let me search for a string constant as long as it is an ascii letter. If I compare bytes to unicode, convert the unicode TO UTF-8 and do the comparison, not the other way around, you idiots. And add a way to turn a quoted string containing Unicode directly into a bytes object, ideally if the source code is itself UTF-8 then the exact bytes, including errors, between the quotes are placed into the constant.
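A sketch of the comparison direction I mean: encode the unicode side to UTF-8 and compare bytes, so garbage in the byte string can never throw. `equals_text` is a made-up name, and returning False for text that cannot be encoded is my assumption about the right behavior.

```python
def equals_text(data: bytes, text: str) -> bool:
    """Compare raw bytes against text by encoding the text to UTF-8.
    The bytes are never decoded, so invalid UTF-8 in 'data' cannot raise."""
    try:
        return data == text.encode('utf-8')
    except UnicodeEncodeError:
        # Text containing lone surrogates has no UTF-8 form (assumed: no match).
        return False

# "Mostly" text: valid UTF-8 with one garbage byte in the middle.
mostly_text = b'caf\xc3\xa9 \x88 end'

# bytes.split() with no argument splits on ASCII whitespace only and
# passes the high-bit bytes through untouched.
words = mostly_text.split()
```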

They should be required to make a file containing Unicode text and exactly one non-UTF-8 byte and all apis must be required to read and write this file, allowing all readable text to be accessed, and obvious methods of copying it must preserve the bytes in it.

Re:The Zen doesn't work on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 09:08 · Score: 1

It's the dropping of support for text in 8-bit strings that is stopping us.

I tried for a while but we were forced to make all our Python APIs take and return "bytes" objects, which do not print correctly. In Python2.x they will at least print if there are no bytes with the high bit set, which gets us some reasonable subset.

Re:It's because Python 3 is broken. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 09:03 · Score: 1

It does seem to me the python parser could treat the expression "A B" as "A(B)". Then "print a" would turn into "print(a)" and 'print "Hello %s"%name' would turn into 'print("Hello %s"%name)' (note you don't need to call format()).

It also would mean that you could type "help foo" instead of "help(foo)" and probably a lot of other useful things.

Not sure if this would violate some existing valid syntax of Python, and there probably are details missing (one obvious one is that "A (B)" should not turn into "A((B))"). But it does seem like a valid idea and it would be much more back-compatible.
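A toy version of the idea, as a line rewriter rather than a real parser change. Everything here is an assumption for illustration: the function name, the regular expression, and the rule of leaving a line alone whenever a keyword or an operator is involved. It only shows that the simple cases can be told apart.

```python
import keyword
import re

# name, whitespace, then the rest of the line
_CALL = re.compile(r'^(\s*)([A-Za-z_]\w*)[ \t]+(.+?)\s*$')

def juxtaposition_to_call(line: str) -> str:
    """Rewrite 'A B' as 'A(B)'; leave anything doubtful unchanged."""
    m = _CALL.match(line)
    if not m:
        return line
    indent, name, rest = m.groups()
    if keyword.iskeyword(name):          # 'return x', 'import os', ...
        return line
    first = re.match(r'[A-Za-z_]\w*', rest)
    if first:
        if keyword.iskeyword(first.group()):   # 'a if b else c', 'a is b'
            return line
    elif rest[0] not in '"\'' and not rest[0].isdigit():
        return line                      # 'A (B)', 'x = 1', 'a + b'
    return f'{indent}{name}({rest})'

examples = ['print a', 'print "Hello %s"%name', 'help foo', 'print (a)', 'x = 1']
rewritten = [juxtaposition_to_call(e) for e in examples]
```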

Re:It's because Python 3 is broken. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 08:58 · Score: 1

If you have a sequence of bytes that is not to be interpreted as text, then it's not a string

No, it is too a string. It may even be 99.99999% valid UTF-8 encodings of "characters". I do not want the fact that a single byte is wrong to break my code, which does not interpret the contents of the string at all, except maybe to look for spaces. And I may require this, for instance if I want to write a program that finds and renames all files with invalid UTF-8 in their filenames. That can't be done if the api to name files will throw a goddamn exception when I give it an invalid name!
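The detection half of such a program only needs to work on raw bytes. A sketch, with made-up function names; it assumes the names are obtained as bytes, which on POSIX can be done by passing a bytes path to os.listdir (for example os.listdir(b'.')).

```python
def is_valid_utf8(name: bytes) -> bool:
    """True if the byte string decodes as strict UTF-8."""
    try:
        name.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

def invalid_utf8_names(names):
    """Return the names (as bytes, untouched) that are not valid UTF-8."""
    return [n for n in names if not is_valid_utf8(n)]

# Stand-in for a directory listing obtained as bytes.
listing = [b'notes.txt', b'caf\xc3\xa9.txt', b'caf\xe9.txt', b'bad\x88name']
bad = invalid_utf8_names(listing)
```

The point is that the names are kept as bytes from start to finish; nothing is converted to a "unicode" thing and back.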

Somehow Unicode turns otherwise intelligent people like you into drooling morons. Think a little. Think about "words" for instance: software has been copying "words" without mangling them or splitting them and even doing complex operations such as arranging them into justified paragraphs or comparing them to other "words", despite the horrible fact that that software fails to spell-check the words! Why, that should be impossible!

Python 3's handling of strings is broken. Sorry.

Re:python sucks on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 08:32 · Score: 1

Bull. Python is the one thinking "ASCII is the only text encoding out there"

Anybody actually paying attention in 2.0 would have changed the default encoding to UTF-8 with non-exception-throwing handling of errors (ie turning them into characters). The fact that a byte stream with any bytes with the high bit set will throw an exception at quite unexpected times and doing the most trivial thing (such as comparing to a "unicode" string constant) is bad and entirely due to Theo's insistence that ASCII is the only thing that exists.

3.0 is marginally better in that it appears to at least acknowledge that UTF-8 exists, but makes a lot of other things worse. In particular now you cannot do any text manipulation without first doing something to detect UTF-8 encoding errors. I can't even search for an 'A' which 2.0 allowed me to do.

This is a disgusting mess. And there is some problem with text in that it causes otherwise brilliant people, such as Theo, to turn into complete morons. I think the problem is that too many people's first interesting piece of software was learning "you can capitalize by doing string[n] = string[n] + 'A' - 'a'". This has then perpetuated the idea that somehow a "character" is important, that you must be able to get a "character" from any integer index, and that they can be substituted for each other. Try using english words (which are much more useful to humans than "characters") and you will see how pointless these assumptions are. I can assure you that software has managed to work with words, doing complex stuff like formatting them into paragraphs without losing a single one or breaking it in the middle, without these limitations. The same applies to "characters".

Re:Because it works. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 08:17 · Score: 1

The "length" of a Unicode string should be an indication of how much memory it takes to store it in a desired encoding.

The belief that somehow this thing called a "character" is important needs to be stamped out. Look up combining characters and composition/decomposition and how other languages treat characters to realize this. Any code that thinks it can do something with a single Unicode code point is by definition broken and it would be nice if the language was designed to make this not so tempting or easy to use, especially because it slows down string processing and causes exceptions from invalid text to be thrown at unexpected and useless times, causing denial of service bugs. I would have indexing a string return a code unit (because code units are useful as they are data that is streamed and because it won't break code that is just looking for some ASCII like spaces and treating all other "characters" the same).
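A quick illustration of the composition problem using the standard unicodedata module. The letter é is just an example choice.

```python
import unicodedata

composed = '\u00e9'                                   # é as one code point
decomposed = unicodedata.normalize('NFD', composed)   # e + combining acute

# Same thing on screen, different "lengths" and different storage sizes.
lengths = (len(composed), len(decomposed))
utf8_sizes = (len(composed.encode('utf-8')), len(decomposed.encode('utf-8')))

# Indexing by code point silently splits the accent off the letter.
first = decomposed[0]
```

Code that treats decomposed[0] as "the first character" has just thrown away the accent.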

If you want a "character" Python should force use of an ITERATOR. There can be many different iterators (such as whether you want precomposed or decomposed "characters"). They can return an indication of error including exactly the byte sequence that is considered an error, without throwing exceptions and without lossy conversion of errors to something that can't be distinguished from valid text.
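A sketch of what such an iterator could look like for UTF-8. The name and the (code point or None, raw bytes) result shape are my own choices, and it leans on Python's strict decoder to decide what counts as valid. It never throws and never loses a byte.

```python
def utf8_code_points(data: bytes):
    """Yield (code_point, raw_bytes) for each valid UTF-8 sequence and
    (None, bad_byte) for each byte that is not part of one."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            n = 1
        elif 0xC2 <= lead <= 0xDF:
            n = 2
        elif 0xE0 <= lead <= 0xEF:
            n = 3
        elif 0xF0 <= lead <= 0xF4:
            n = 4
        else:
            n = 1  # stray continuation byte or impossible lead byte
        chunk = data[i:i + n]
        try:
            ch = chunk.decode('utf-8')  # strict: rejects overlongs, surrogates
        except UnicodeDecodeError:
            yield None, data[i:i + 1]   # report exactly the offending byte
            i += 1
            continue
        yield ord(ch), chunk
        i += n

data = b'A\xc3\xa6\x88\xe2\x82\xac\n'
items = list(utf8_code_points(data))
```

Joining the raw_bytes of every item gives back the input unchanged, errors included, which is the property I want from any such iterator.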

Python has a chance to get us out of the dark ages of text but not as long as brainless people keep saying "character"!!!

Re:Because it works. on Why Do Projects Continue To Support Old Python Releases? · 2014-01-10 08:08 · Score: 2

On the parentheses: It looks like it would have been possible to make the parser (or at least the command line input) parse "A B" into the same result as "A(B)". Perhaps this conflicts with some part of Python syntax somewhere but I don't see it right now. This would then allow print to work as before, and would also make things like "help foo" typed on the command line work.

Strings however are a serious impediment to 3.0. You are blissfully ignoring the huge problem: a range of bytes can contain any arrangement, including invalid UTF-8. Just because you read them into a "string" does not magically make this go away. Python's insistence on either converting to UTF-16 or making it impossible to use character functions really makes it impossible to do string manipulation with data that comes from arbitrary sources. For instance the simple act of 'foo == "constant"' can now throw an exception if foo contains an encoding error! That is so bogus I don't know what to do.

Strings should be 8 bits and contain every single possible arrangement of bytes, whether it is valid in an "encoding" or not. And if you really want to get a "character" (which in fact you almost NEVER are interested in, instead you are interested in subsections of the string) then programmers should use *ITERATORS*. Then you can use the correct iterator for the encoding you want, such as whether you want precomposed or decomposed Unicode code points, or UTF-16 code units, or you know this string is ISO-8859-1, and it can clearly return indicators for invalid encodings and never ever throw an exception!

Until Theo gets this through his thick skull (he seems to have a total blockage of understanding that text is a stream and interpretation should be DEFERRED) then it will be impossible to switch to Python 3.0. Python 2.0 at least preserves an arbitrary array of bytes in a string object, though I have to be really f**King careful never to use "unicode" anywhere because it may throw an exception when I don't want it.

Re:Backwardness of KDE continues on KDE Releases Frameworks 5 Tech Preview · 2014-01-08 07:21 · Score: 3, Interesting

Your "script-based approach" eventually calls something. Ie if your script has "put a movie here" it does not magically work without a *lot* of code that deals with actually getting a movie from where it is stored, converting its storage into a form that your eyes can see, and placing it on the screen with proper sync. This stuff does not magically exist because you have a "script-based approach". This sounds a lot like it is providing these things.

Writing Qt apps is getting pretty close to scripting, and in many cases you are running parsers and interpreters. Often this will reduce code size and thus increase speed over C++. Also lots of Qt is written in Python. I don't know much about it but I think QtQuick is pretty much another interpreter and they are planning on moving everything to it.

Re:KODAK is actually a good example. on The Internet's Network Efficiencies Are Destroying the Middle Class · 2014-01-07 13:10 · Score: 1

I'm sorry that is not an explanation because you are making an assumption about what is "fair" with no backing. You can make equally good arguments that taking the same amount of money from everybody is "fair" or leaving everybody the same amount of remaining money is "fair", thus covering a huge range of extremes.

You can also make all kinds of curves that are not straight lines, for instance you have proposed two such curves already.

This has nothing to do with "fair". I am looking for an explanation why all proposals to simplify the tax code seem to think a straight line is somehow a requirement by mathematical rules or something. Until that is answered I really question the motives of people proposing this.

Re:What about all the new jobs in the "digital" ag on The Internet's Network Efficiencies Are Destroying the Middle Class · 2014-01-07 07:44 · Score: 1

A huge problem with this is that your "refund" would be a government handout program hundreds of times bigger than current Welfare, with all the inefficiencies and cheating and other problems.

I would love to see an actual logical explanation why "tax simplification" is always tied to "flat tax percentage". I see no reason. Yes, greatly simplify the rules to compute the income, but there is no reason the resulting number cannot be then taxed at a graduated rate. Conversely you could implement a "flat tax percentage" atop the current complex mess of deductions. I just don't see why it is always tied together (though I have suspicions that it has nothing to do with any technical requirement).

Also your "refund" is equivalent to a non-constant tax rate. It is 0% at the point where the flat tax on the income equals the refund, and approaches the "flat percentage" as the income approaches infinity. In your scheme it also goes to negative infinity percent as the income goes down to zero.
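To put numbers on that: with a flat rate r and a fixed refund R, the net tax is r*income - R, so the effective rate is r - R/income. The 20% rate and the 5000 refund below are invented purely for illustration.

```python
FLAT_RATE = 0.20   # assumed flat percentage
REFUND = 5000.0    # assumed fixed refund paid to everybody

def effective_rate_with_refund(income, flat_rate=FLAT_RATE, refund=REFUND):
    """Net tax (flat tax minus refund) as a fraction of income."""
    return (flat_rate * income - refund) / income

# Break-even is where the flat tax equals the refund: income = R / r.
break_even = REFUND / FLAT_RATE

for income in (1.0, 1000.0, break_even, 100000.0, 1e12):
    print(income, effective_rate_with_refund(income))
```

The printed rates start hugely negative, cross zero at the break-even income and creep up toward the flat rate without ever reaching it, which is a graduated rate in everything but name.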

Re:KODAK is actually a good example. on The Internet's Network Efficiencies Are Destroying the Middle Class · 2014-01-07 07:22 · Score: 1

If you were truly interested in having everyone pay their "fair share", you'd tax a flat percentage of all income above poverty level, with no loopholes, deductions, or credits.

I'm curious because I have seen this mentioned many times. I think "no loopholes, deductions, or credits" is a great goal and would very much support it. However I am dubious about "flat percentage" and in fact you actually contradict that since your "all income above poverty level" is actually a tax of zero at "poverty level" rising asymptotically to the "flat percentage" as the income rises to infinity. I would prefer a smooth curve with no zero level and a continuous first derivative (your proposal and current tax brackets are continuous but have a non-continuous first derivative). This would make all people pay a non-zero tax which would help everybody to feel they are part of the system.
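The same arithmetic for the quoted proposal, as a sketch with invented numbers (20% rate, 15000 threshold). The tax is r * max(0, income - P), so the effective rate is r * (1 - P/income) above the threshold, and the marginal rate jumps from 0 to r exactly at P, which is the non-continuous first derivative.

```python
FLAT_RATE = 0.20      # assumed flat percentage
THRESHOLD = 15000.0   # assumed poverty level

def tax_above_threshold(income, flat_rate=FLAT_RATE, threshold=THRESHOLD):
    """Flat percentage applied only to income above the threshold."""
    return flat_rate * max(0.0, income - threshold)

def effective_rate(income):
    """Tax paid as a fraction of total income."""
    return tax_above_threshold(income) / income

def marginal_rate(income, step=1.0):
    """Extra tax per extra unit of income (finite difference)."""
    return (tax_above_threshold(income + step) - tax_above_threshold(income)) / step

for income in (10000.0, THRESHOLD, 30000.0, 150000.0, 1e9):
    print(income, effective_rate(income), marginal_rate(income))
```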

Is there any actual argument that says you must have "flat percentage" if you also have "no loopholes, deductions, or credits"? I feel this is instead an attempt to destroy progressive taxation by tying it to simplified taxation, because I sure don't see it.

Re:Sure, why not on Cairo 2D Graphics May Become Part of ISO C++ · 2014-01-06 08:22 · Score: 1

I think I agree with you. There should not be any templates.

I think the proper solution is to standardize on UTF-8, and use *ITERATORS* (not array index) to see the characters. This avoids the huge problem that "character" is not actually defined (ie what do you do with combining characters or invisible ones or with composite characters such as ½). Iterators will also allow errors in the UTF-8 to be returned as unique values, rather than using exceptions (which typically destroy the iterator) or lossy conversion to a character that could also be validly encoded. This inability to handle errors is *HUGE* and I am really annoyed by all the people who want to pretend that some physical natural law is going to make them non-existent. This is so misguided that I am absolutely floored that anybody believes it, but most UTF-8 decoders act this way!

The fact that std::string::operator[] returns a non-const reference is another disgusting mistake, pandering to the idiots who think case-conversion is something users want and that it can be done by simply substituting "characters" (whatever a "character" is...). If operator[] returned a const reference we could use reference counting rather than having to copy std::strings.

Libraries that don't use iterators and thus require pre-conversion to an array of "characters" are broken. Such conversions are wasteful, must be redone depending on what type of "character" the algorithm needs, and are lossy. I am ignoring all the current C++ attempts at "Unicode handling" because of this. But the language itself still needs UTF-8 string constants with the ability to insert arbitrary bytes, even if a fixed library is provided.

Re:Sure, why not on Cairo 2D Graphics May Become Part of ISO C++ · 2014-01-06 05:12 · Score: 1

To stop discouraging use of Unicode. Currently if there is text on a disk that for some reason has a string of bytes that is not UTF-8, even if it is *mostly* UTF-8 and contains even the tiniest error, the programmer is often forced to abandon use of UTF-8 due to the perverse insistence of people (perhaps you) that somehow a subset of byte strings somehow violates god's principles and must be made impossible at all costs.

This is probably the biggest impediment to I18N, and the weird thing is that the people behind it actually think they are helping.

Re:Sure, why not on Cairo 2D Graphics May Become Part of ISO C++ · 2014-01-05 11:13 · Score: 1

What exactly is missing in C++11?

UTF-8 string constants (with \xNN producing raw bytes whether or not they are valid UTF-8).

These are not there due to Microsoft's insistence on translating the input to UTF-16 before the parser runs and they wanted to preserve the ability to make string constants in CP1252 or other 8-bit sets. Otherwise it would be trivial because it would involve copying the bytes unchanged from the source code to the string constant.

Re:Solution for population control on Illinois Law Grounds PETA Drones Meant To Harass Hunters · 2014-01-02 10:54 · Score: 1

Yes I agree, however I just wanted to point out that they are not ignoring the overpopulation problem. They have a "solution" even though it is probably impractical and not actually nicer to the animals.

Re:which is why hunters ask permission on Illinois Law Grounds PETA Drones Meant To Harass Hunters · 2014-01-02 07:24 · Score: 1

Most hunters may ask for permission (and obey if denied permission). He is worried about the non-zero sized set that does not do this.

Re:Solution for population control on Illinois Law Grounds PETA Drones Meant To Harass Hunters · 2014-01-02 07:21 · Score: 2

PETA does propose a solution: reintroduction of natural predators. Probably not an actual working solution but they are nowhere near as ignorant as you state.

User: spitzak

Comments · 5,741