
Conversation with #scipy at Sat 23 Aug 2008 10:34:08 AM PDT on (irc)

(10:34:50 AM) dstarr [n=chatzill@] entered the room.

(10:35:29 AM) eads: Hi there Matthew.

(10:36:29 AM) MatthewBrett: Hello - what are you working on?

(10:37:28 AM) stefanv: Hey guys

(10:38:09 AM) MatthewBrett: (mainly for Travis V) - I'm logging the chats from now...

(10:39:11 AM) stefanv: First post!

(10:39:14 AM) stefanv: :P

(10:42:04 AM) eads: Matthew: I posted this earlier but I'm working on restructuring some remaining docs in scipy.cluster so they obey the docs standard.

(10:44:05 AM) agrimstrup: I'm taking a run at the c structure to dtype converter.

(10:45:37 AM) stefanv: I'm working on a mock-up for a new front-page

(10:46:09 AM) eads: I think the design you showed me yesterday is an awesome start. It looked very clean.

(10:48:14 AM) stefanv: Great, I'm going to use some free icons until we design something ourselves... Just get the general layout there, and show it to people to get more feedback.

(10:52:23 AM) ondrej: enjoy hacking stefanv and others. :)

(10:54:14 AM) ondrej left the room (quit: "Leaving").

(11:10:36 AM) diane [] entered the room.

(11:15:52 AM) MatthewBrett: Chris Burns and I are working on some problems that came up using memmaps, I believe mainly with subclassing. Cindee Madison and Mike Trumpis are working at porting the matlab interface (and therefore Octave) from Sage to a numpy / BSD package (with William's kind permission).

(11:16:45 AM) MatthewBrett: On that note, does anyone have any experience with text parsing using modules like pyparser?

(11:16:51 AM) diane: yes

(11:16:58 AM) MatthewBrett: Pyparser?

(11:17:16 AM) diane: uh... trying to remember

(11:17:44 AM) diane: yes

(11:17:52 AM) MatthewBrett: Then I'll walk over to your table!

(11:20:10 AM) eads: I used ply and thought it was very easy to use having used bison, jlex, yacc, cup and several other parser generators in the past.

(11:20:44 AM) eads: It is a remarkably pleasant experience to put together a parser generator with ply.

(11:21:17 AM) eads: It's an LALR parser.

(11:21:39 AM) eads: I used it to parse a subset dialect of Python last year.

(11:22:30 AM) cyclone31 left the room (quit: "Leaving.").

(11:22:32 AM) diane: that might be better than pyparsing, I don't know if I was misusing pyparsing but it was taking tens of minutes to parse a 10 mb string

(11:26:39 AM) eads: My experience is that ply is pretty quick to parse.

(11:26:50 AM) eads: It's also entirely implemented in Python.

(11:27:08 AM) eads: It's written by David Beazley, the author of Python: Essential Reference.

(11:27:43 AM) eads: I would assume his an expert with how to use Python strings efficiently.

(11:27:50 AM) eads: err, he's an expert

(11:27:57 AM) diane: it looks promising

(11:28:11 AM) eads: Grammars are written as multi-line strings.

(11:28:21 AM) eads: Productions reside in functions prepended with "p<underscore>"

(11:28:53 AM) eads: The rules for a production, i.e. the right hand side rules of a production, are expressed in the production functions docstring.

(11:29:01 AM) eads: I think that's a good, clean design.

(11:29:16 AM) eads: In a few days, I was able to write a basic parser for a subset of python.

(11:29:47 AM) eads: Alas, I wrote it at the lab, and haven't yet obtained approval to show it to others but I'd be happy to help out with the effort of writing a matlab grammar.

(11:29:51 AM) eads: Is that what you are intending to do?

(11:30:04 AM) diane: I believe that's what they're trying to do

(11:30:09 AM) diane: (upstairs)

(11:30:27 AM) cyclone31 [] entered the room.

(11:30:44 AM) eads: I'm at LAX so I'd have to help remotely.

(11:30:51 AM) eads: but...

(11:30:56 AM) eads: MATLAB's syntax can get messy.

(11:31:16 AM) eads: Is the intention to just cover the common cases?

(11:31:16 AM) diane: I'm just sitting next to the people working on it...

(11:31:22 AM) eads: Or also the obscure ones?

(11:31:51 AM) diane: hi im cindee

(11:32:05 AM) diane: trying to make as flexible as possible

(11:32:13 AM) diane: but also only have octave installed

(11:32:56 AM) eads: The MathWorks examples in their manual tend to be pretty clean.

(11:33:06 AM) eads: I'd say if those run, we'd be off to a pretty good start.

(11:33:22 AM) eads: Is the objective to translate matlab programs into python programs?

(11:33:49 AM) eads: Or to write an interpreter running in Python that translates MATLAB expressions and statements into Numpy calls.

(11:33:50 AM) diane: they've gone and curled back up in their laptops

(11:34:04 AM) diane: My interpretation is they need to interface to some matlab code

(11:34:12 AM) eads: Also, one would need to preserve the pass-by-valueness of MATLAB.

(11:34:38 AM) dwf: It isn't quite pass by value

(11:34:49 AM) eads: Well, pass-by-value with copy-on-write.

(11:34:57 AM) dwf: right

(11:35:06 AM) dwf: but the semantics of numpy arrays are quite different

(11:35:18 AM) dwf: for one thing the existence of a rank 1 array

(11:35:31 AM) dwf: and the fact that transposing it does nothing for you

(11:35:32 AM) eads: Indeed.

(11:35:41 AM) dwf: that annoyed the hell out of me for a while

(11:36:06 AM) eads: Yes, and wrapping it with matrix(...) is a fix but it's not clean.
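
For readers following along, a minimal sketch of the rank-1 behavior dwf describes. The variable names are illustrative only, and adding an explicit axis is shown as one alternative to wrapping with matrix(...), not as the approach anyone in the channel settled on.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])    # rank-1 array, shape (3,)
vt = v.T                         # transposing a 1-D array leaves the shape unchanged

# To get MATLAB-like row and column vectors, add a second axis explicitly.
row = v[np.newaxis, :]           # shape (1, 3)
col = v[:, np.newaxis]           # shape (3, 1)

print(v.shape, vt.shape, row.shape, col.shape, row.T.shape)
```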

(11:36:31 AM) eads: Is there a read-only flag with numpy arrays?

(11:36:48 AM) dwf: not quite, I don't think

(11:37:31 AM) dwf: I could be wrong

(11:38:45 AM) eads: There is a WRITEABLE flag.

(11:39:52 AM) tvaught: a.flags['WRITEABLE'] = False

(11:39:52 AM) tvaught: In [7]:

(11:39:52 AM) tvaught: In [7]: a[0]=2

(11:39:52 AM) tvaught: ---------------------------------------------------------------------------

(11:39:53 AM) tvaught: <type 'exceptions.RuntimeError'> Traceback (most recent call last)

(11:39:53 AM) tvaught: <type 'exceptions.RuntimeError'>: array is not writeable

(11:39:54 AM) tvaught: In [8]:

(11:40:01 AM) tvaught: Works for me.

(11:40:10 AM) eads: Splendid.
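
The session tvaught pasted can be reproduced as a short script. This is a sketch, not a transcript: the log shows a RuntimeError, while current NumPy releases raise ValueError for the same write, so both are caught here.

```python
import numpy as np

a = np.arange(5)
a.flags["WRITEABLE"] = False     # mark the array read-only

try:
    a[0] = 2
    write_failed = False
except (ValueError, RuntimeError):
    # RuntimeError in the 2008 session above, ValueError in current NumPy
    write_failed = True

b = a.copy()                     # a copy owns its data and is writeable again
b[0] = 2

print(write_failed, a[0], b[0])
```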

(11:40:20 AM) eads: What would be nicer is to have a copy behavior.

(11:40:49 AM) dwf: As in, a DO_NOT_COPY_NEVER_EVER flag?

(11:41:15 AM) eads: Something where a MATLAB statement like my_function(X) gets translated into python as my_function(CopyOnWrite(X))

(11:41:55 AM) eads: CopyOnWrite(X) creates an object that's a subclass of ndarray, which copies the contents of X into a new buffer on the first write and uses that for subsequent writes in my_function

(11:42:05 AM) eads: In this way, my_function has its own 'version' of X.

(11:42:18 AM) eads: It may not be the cleanest solution but it's the first thing that came to mind for me.
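
A rough sketch of the idea eads outlines. Two assumptions are worth stating loudly: the class name CopyOnWrite comes from the chat, but everything inside it is hypothetical, and this version is a plain wrapper object rather than the ndarray subclass eads describes, because swapping the buffer of a live ndarray is harder than it sounds.

```python
import numpy as np

class CopyOnWrite:
    """Hypothetical wrapper: reads share the caller's buffer, the first write copies it."""

    def __init__(self, source):
        self._data = np.asarray(source)   # no copy yet
        self._owns_copy = False

    def __getitem__(self, index):
        # Limitation: slices are views, so writing through a returned
        # slice would bypass the protection.
        return self._data[index]

    def __setitem__(self, index, value):
        if not self._owns_copy:
            self._data = self._data.copy()   # first write: take a private copy
            self._owns_copy = True
        self._data[index] = value

    def to_array(self):
        return self._data


def my_function(x):
    x[0] = 99            # triggers the private copy
    return x[0]


X = np.array([1, 2, 3])
wrapped = CopyOnWrite(X)
shared_before = np.shares_memory(wrapped.to_array(), X)   # nothing copied yet
result = my_function(wrapped)
shared_after = np.shares_memory(wrapped.to_array(), X)    # private copy now
```

With this wrapper the caller's X keeps its original contents, while my_function works on its own version after the first write.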

(11:42:50 AM) eads: dwf: we want copy-on-write to preserve MATLAB semantics.

(11:43:11 AM) oliphant left the room (quit: ).

(11:43:33 AM) dwf: The problem with that is that then you're tying objects to having knowledge of the call stack

(11:43:53 AM) eads: No, you don't.

(11:43:56 AM) eads: I don't believe so.

(11:44:04 AM) dwf: well, if this only happens in my_function

(11:44:13 AM) dwf: unless you explicitly say x = CopyOnWrite(x)

(11:44:22 AM) dwf: at the beginning of my_function

(11:44:25 AM) dwf: that'd be fine

(11:44:46 AM) dwf: In fact, you'd be able to do that with a decorator
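
One way the decorator dwf mentions might look. The name value_semantics is made up for illustration, and this sketch copies ndarray arguments eagerly on entry instead of on first write, which is simpler but gives up the memory savings of true copy-on-write.

```python
import functools
import numpy as np

def value_semantics(func):
    """Hypothetical decorator: give func private copies of its ndarray arguments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = [a.copy() if isinstance(a, np.ndarray) else a for a in args]
        kwargs = {k: (v.copy() if isinstance(v, np.ndarray) else v)
                  for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapper


@value_semantics
def scale_in_place(x, factor):
    x *= factor          # modifies only the private copy
    return x


X = np.array([1.0, 2.0])
Y = scale_in_place(X, 10)
```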

(11:44:48 AM) eads: Python technically isn't pass-by-reference.

(11:44:54 AM) eads: This is a somewhat pedantic point.

(11:44:59 AM) eads: You can't modify the caller's reference.

(11:45:03 AM) eads: The reference gets copied.

(11:45:56 AM) eads: You can use that copy in my_function as you wish.
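
The distinction eads is drawing can be shown with plain lists. The function names are illustrative: rebinding the parameter inside a function leaves the caller's name untouched, while mutating the object it refers to is visible to the caller.

```python
def rebind(items):
    items = [0, 0, 0]    # rebinds the local name only; the caller's list is untouched

def mutate(items):
    items[0] = 99        # changes the object both names refer to

data = [1, 2, 3]

rebind(data)
after_rebind = list(data)

mutate(data)
after_mutate = list(data)

print(after_rebind, after_mutate)
```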

(11:46:09 AM) dwf: No, I know

(11:46:16 AM) eads: Okay.

(11:46:46 AM) MatthewBrett: Sorry, was indulging in laxness of personal conversation. Parsing was for getting matlab variables printed to the matlab console using expect. And pumping python variables likewise. Thus we need to define the grammar of matlab formatting for the various objects such as matrix, cell array, struct array.

(11:47:02 AM) dwf: And modifying the reference within the scope of my_function is fine

(11:47:10 AM) MatthewBrett: As sage does it.
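
To make the parsing task concrete, here is a very small sketch for the simplest case only. It assumes console text shaped like Octave's default display of a small real matrix (a "name =" line followed by rows of numbers); the function name and sample string are hypothetical, and cell arrays, struct arrays, column-wrapped output and scale factors are not handled.

```python
import numpy as np

def parse_printed_matrix(text):
    """Parse 'name =' followed by rows of whitespace-separated numbers."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    name = header.rstrip("=").strip()
    values = np.array([[float(tok) for tok in row.split()] for row in rows])
    return name, values

# Assumed sample of console output, not captured from a real session.
sample = "x =\n\n   1   2\n   3   4\n\n"
name, values = parse_printed_matrix(sample)
```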

(11:47:13 AM) dwf: wrapping it in a CopyOnWrite, or a matrix()

(11:47:25 AM) dwf: well, as_matrix() I should say.

(11:48:23 AM) eads: I would imagine users who use matlab2py might want to eventually remove the protections as appropriate since the flexibility offered by reference semantics is much nicer.

(11:48:40 AM) eads: For example, dealing with large data sets.

(11:49:26 AM) dwf: oh god yes, even COW is insane if your data is too big

(11:50:47 AM) dwf: one of my profs writes matlab scripts like shell scripts, no function definitions, which as someone who went through a computer science degree I thought was insane

(11:51:52 AM) eads: Ick.

(11:52:11 AM) eads: I'm in favor of building your experiments and paper with make.

(11:52:29 AM) eads: In the interest of reproducible science.

(11:53:33 AM) dwf: indeed

(11:54:20 AM) eads: One of my colleagues got me into doing that several years ago. To build his book, it takes about a day because it reruns his experiments and redraws its plots.

(11:55:42 AM) dwf: So the status of is... not quite yet.

(12:02:32 PM) oliphant [n=oliphant@] entered the room.

(12:03:06 PM) oliphant left the room (quit: Client Quit).

(12:04:50 PM) tvaught: @dwf: do you want to elaborate on the points where you're stuck?

(12:07:16 PM) dwf: well, I've gotten over a bunch of hurdles already, for one thing the svn versions of py2app, modulegraph and macholib from actually get the app built properly

(12:07:17 PM) oliphant [n=oliphant@] entered the room.

(12:07:30 PM) dwf: it seems to not be including pyface for some reason

(12:08:16 PM) dwf: oh, maybe it is. hm.

(12:08:57 PM) dwf: ImportError: unable to import a pyface backend for any of the wx, qt4, null toolkits

(12:10:20 PM) oliphant left the room (quit: Client Quit).

(12:19:15 PM) eads: Committed my changes. Now boarding my flight. Talk to you guys later.

(12:19:23 PM) eads left the room (quit: "Ex-Chat").

(12:19:24 PM) diane left the room (quit: Read error: 110 (Connection timed out)).

(12:21:29 PM) dwf: stefanv:

(12:22:03 PM) stefanv: thanks

(12:22:31 PM) stefanv: "I am currently looking for cocoa developer(s) to take over this project. Please let me know if you're interested. "

(12:24:55 PM) stefanv: dwf: xchat available under macports

(12:25:00 PM) stefanv: v 2.8.6

(12:25:20 PM) dwf: probably requires X11 and looks ugly, though

(12:38:32 PM) detrout [] entered the room.

(01:11:40 PM) oliphant [] entered the room.

(01:16:07 PM) akakssss [] entered the room.

(01:20:28 PM) cburns [] entered the room.

(01:39:31 PM) droova [] entered the room.

(01:39:32 PM) akakssss left the room (quit: Read error: 104 (Connection reset by peer)).

(01:40:15 PM) droova left the room (quit: Client Quit).

(01:55:57 PM) detrout: I solved it by copying ipython to numpy/bin and changing the hardcoded path

(02:17:46 PM) dwf left the room (quit: "This computer has gone to sleep").

(02:26:50 PM) dwf [] entered the room.

(02:38:45 PM) redfox [] entered the room.

(02:38:59 PM) redfox is now known as eads

(02:40:16 PM) detrout left the room (quit: Read error: 110 (Connection timed out)).

(02:41:51 PM) eads: So where is the MATLAB parser going to live?

(02:50:10 PM) detrout [] entered the room.

(03:10:53 PM) eads left the room (quit: Read error: 110 (Connection timed out)).

(03:12:44 PM) detrout left the room (quit: "using sirc version 2.211+KSIRC/1.3.12").

(03:16:03 PM) agrimstrup: Any pyparsing experts in the house?

(03:18:08 PM) tvaught [] entered the room.

(03:19:48 PM) MatthewBrett: (matlab parser) - I don't know. We have to check with William S on the license. After - scikit /

(03:33:45 PM) detrout [] entered the room.

(03:42:32 PM) detrout left the room (quit: "using sirc version 2.211+KSIRC/1.3.12").

(04:14:14 PM) MatthewBrett: Memmap question for the team

(04:14:54 PM) MatthewBrett: As Chris and I achieve increased understanding of subclassing, we are going back to this page: /Subclasses

(04:16:16 PM) MatthewBrett: In the 'words of caution' section, we are discouraged from creating default attributes in the __new__ method. Two questions

(04:16:29 PM) MatthewBrett: Is the statement that 'it is not thread safe' true?

(04:18:29 PM) MatthewBrett: Don't we (as per the example) _have_ to set any new attributes in the __new__ method (as well as __array_finalize__), because we can't set them in __array_finalize__, as we typically only reach this from the __new__ method by creating an ndarray, and we have no mechanism to set instance attributes there?
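
For context, a minimal sketch of the pattern being asked about, assuming a hypothetical subclass InfoArray with one extra attribute, info. It shows a value being set in __new__ and carried over in __array_finalize__. It says nothing either way about the thread-safety question.

```python
import numpy as np

class InfoArray(np.ndarray):
    """Hypothetical ndarray subclass carrying one extra attribute, info."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)   # calls __array_finalize__ first
        obj.info = info                           # then set the caller-supplied value
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        # Reached for view casting and slicing too, where __new__ is not called,
        # so the attribute is inherited from obj or given a default here.
        self.info = getattr(obj, "info", None)


a = InfoArray([1, 2, 3], info="scan 1")
b = a[1:]                             # new-from-template: only __array_finalize__ runs
c = np.arange(3).view(InfoArray)      # view casting: info falls back to the default
```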

(04:51:24 PM) dwf: oliphant is probably the person to ask

(04:51:55 PM) dwf: but he may or may not be paying any attention to IRC

(04:53:00 PM) oliphant: I'm not sure about the thread safe question.

(04:53:48 PM) dwf: so I guess the answer is "no, it is not thread safe" and the statement is true, to the best of anyone's knowledge

(04:54:17 PM) oliphant: Perhaps, but it would be nice to have a better explanation as to why it is not thread safe.

(04:54:59 PM) oliphant: I don't see why it is not thread safe.

(05:14:40 PM) MatthewBrett: I'll email the author and ask him - thanks for the feedback. Travis, any comment on my second question?

(05:59:16 PM) dwf: I think maybe saying 'oliphant' might have some mechanism to get his attention

(05:59:35 PM) oliphant: Yeah, Colloquy blinks at me.... :-)

(05:59:52 PM) dwf: aha - still there Matthew?

(06:00:08 PM) dwf: or should I say, MatthewBrett

(06:00:28 PM) cburns left the room (quit: ).

(06:00:29 PM) MatthewBrett: I am

(06:01:45 PM) MatthewBrett: I had a talk to Travis, the light shined, and I see my way clearly now.

(06:02:11 PM) dwf: oh, you're in the building too

(06:07:10 PM) __pv: In case someone wants to solve some riddles of the Sphinx, here's a bzr branch of ongoing work on the numpy reference guide: ; it pulls docstrings via sphinx.ext.autodoc + custom magick, but there is still some mileage to go in making the output pretty enough.