PMXBOT Log file Viewer

#mongodb logs for Wednesday the 27th of June, 2012

[00:00:15] <landongn> anyone at all? just looking for some kind of stepping off point to dig, so far i've come up with squat.
[00:00:51] <landongn> it doesn't seem to affect performance much if any, but it's just something I can't explain and shouldn't have that sort of behavior (from the lack of systems using any kind of NoCursorOption or lack of limits/batch_size())
[00:18:29] <landongn> can anyone tell me if closeAllDatabases is safe to run on a primary in a replica set?
[01:16:53] <jamiel> Hi guys, having a bit of trouble debugging an issue with the PHP Mongo driver and was hoping someone could help. I have a unique index on an id column which is in fact a unique numerical string i.e. "123456", and in my local environment everything is dandy. On staging we have sharding enabled and the shard key is this id + created_at timestamp. Identical documents insert fine in the local environment but not staging, and I can't seem to get any error out of the driver
[01:40:19] <irishcoder> hello?
[01:40:36] <irishcoder> Is anyone here?
[01:40:53] <irishcoder> #mongodb
[01:43:01] <giessel> dgottlieb: so you know, i had to change 2 spots--- the one in builder.h (to 512mb), and another in dur.h. (increased UncommittedBytesLimit to the max unsigned int value- 4294967295)
[01:43:23] <giessel> dgottlieb: it compiled, but i haven't ran anything. this was just on my laptop, i'll do it on my workstation tomorrow AM and see if it works
[03:06:50] <jamiel> Do I need to pass my shard key with the upsert? what is the parameter? have tried _shardKey?
[03:07:44] <jamiel> Found an error in my log now: User Assertion: 8012:can't upsert something without valid shard key
[05:03:56] <tystr> is anyone here using 10gen's monitoring service?
[05:21:08] <tystr> wow, this looks pretty slick
[07:36:01] <[AD]Turbo> yo all
[07:42:24] <NodeX> howdee
[08:38:51] <wereHamster> question about data modelling. I have a model, it has 'spells'. In the application spells is a map from name -> Spell (names are unique, thus a map). In mongodb I can model it as a object, or as an array of {name, spell} pairs (and use a unique index to ensure names are unique).
[08:40:41] <wereHamster> when reading posts on the internet, the community seems to lean towards an array (because mongodb has better support for operations on arrays).
[08:44:20] <NodeX> is there a question coming?
[08:58:16] <wereHamster> oh, yes. should I use an array or an object? :P
[09:00:08] <NodeX> they are treated the same in terms of being able to query/index
[09:03:02] <kobold> wereHamster: I personally use arrays, because I prefer the {spells.name: "something"} query syntax over the computed key name "spells.something"
[09:04:07] <kobold> wereHamster: the latter is a bit faster if I recall correctly, though; I benchmarked the two options a couple of months ago, but the difference is very small
[09:04:52] <wereHamster> it's not about speed. The data will be loaded from the database into the app once.
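
For reference, a minimal sketch of the two shapes being compared above (collection and field names are hypothetical):

    // Array of {name, spell} pairs:
    db.characters.insert({ _id: 1, spells: [ { name: "fireball", level: 3 } ] })
    db.characters.find({ "spells.name": "fireball" })            // query by element field

    // Object keyed by spell name (the "computed key" style kobold mentions):
    db.characters.insert({ _id: 2, spells: { fireball: { level: 3 } } })
    db.characters.find({ "spells.fireball": { $exists: true } }) // key must be known up front
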
[09:40:02] <_johnny> i'm looking at http://www.mongodb.org/display/DOCS/Replica+Set+Tutorial but there's one thing i don't understand about elections: there's an example with a replication set of 3, where in failover the two remaining vote. one votes for itself, the other for the first => first is selected. how the first makes this decision (other than being the first to vote) i'm not sure, but my question: later in the docs it says running with two no
[10:04:18] <lackdrop> Does mongodb keep a query log? I want to see what queries a program is sending to it. Im tailing /usr/local/var/log/mongodb/output.log but theyre not there.
[10:05:51] <sirpengi> lackdrop: http://www.mongodb.org/display/DOCS/Database+Profiler does that help?
[10:09:50] <lackdrop> sirpengi: I saw that, but thought that it would just be profiling commands you ran through the mongo shell. Have I got that wrong? In any event, I cant even get the profile info for shell commands for some reason, though Ill keep at it.
[10:11:59] <lackdrop> sirpengi: Oh no, youre right, that does just what I need. Thanks.
[10:12:06] <sirpengi> np
[10:16:42] <cclever> Hi there, I have a problem with $slice. on one machine (2.0.6) my query returns only the sliced field (and not the other properties of the document), and on the other machine (2.0.4) it returns the complete document with the correct part of the sliced property. Any idea how that can be?
[10:19:55] <NodeX> can you pastebin the query and the results from both machines
[10:24:32] <cclever> sure: http://pastebin.com/23sPqxir
[10:25:29] <cclever> please find the query on top, then the result from machine A, and from Line 2383 the result from machine B
[10:29:11] <NodeX> I can't even read the data properly - it's very long lol
[10:30:12] <lackdrop> Em, does update not support inserting a document if it does not exist or $inc a counter if it does? Im getting "Modifiers and non-modifiers cannot be mixed"...
[10:30:58] <cclever> NodeX, correct: It is long. let me paste you a better example
[10:31:27] <NodeX> you can $inc lackdrop but not on the upsert field
[10:32:09] <NodeX> http://www.mongodb.org/display/DOCS/Updating#Updating-UpsertswithModifiers
[10:33:13] <NodeX> $set : { blahblha }, $inc : {foo : 1} ...
[10:35:18] <lackdrop> NodeX: But this isnt valid? db.people.update( { name:"Joe" }, { x: 1, $inc: { y:1 } }, true ), meaning if Joe doesnt exist create him with x and y = 1, otherwise just inc y.
[10:35:57] <lackdrop> Oh, I suppose I could just $set x and that would work?
[10:38:39] <cclever> NodeX: better example http://pastebin.com/DuKEPRPh
[10:44:23] <NodeX> lackdrop : use $set
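
A sketch of what NodeX is suggesting, using lackdrop's hypothetical people collection: keep every field inside a modifier so the update document contains no plain (non-modifier) values.

    // Fails with "Modifiers and non-modifiers cannot be mixed":
    // db.people.update({ name: "Joe" }, { x: 1, $inc: { y: 1 } }, true)

    // Works: wrap the plain field in $set so the whole update is modifiers
    db.people.update(
        { name: "Joe" },
        { $set: { x: 1 }, $inc: { y: 1 } },
        true                                   // upsert: create Joe if he doesn't exist
    )
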
[10:45:01] <NodeX> cclever : you didn't do the same query on both in the paste
[10:46:42] <cclever> its the same query
[10:49:00] <NodeX> what are the differences between the machines apart from the version number?
[10:50:19] <cclever> One machine is Cent OS, one is ubuntu
[10:50:44] <cclever> one machine is on mac / parallels, one on windows / vmware
[10:55:04] <NodeX> 32bit / 64bit?
[10:57:04] <cclever> both 64bit
[10:58:52] <NodeX> that is strange
[11:03:42] <cclever> nodex: yes, it is. the documentation says "Filtering with $slice does not affect other fields inclusion/exclusion. It only applies within the array being sliced."
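
For context, a sketch of the $slice projection being discussed (collection and field names hypothetical); per the docs cclever quotes above, it should limit only the array and leave the document's other fields untouched:

    // First 3 elements of "comments", plus all other fields of each document
    db.posts.find({}, { comments: { $slice: 3 } })

    // $slice can also take [skip, limit]
    db.posts.find({}, { comments: { $slice: [10, 3] } })
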
[11:53:20] <pnh7> Hi... can you give me more information about benchrun() ? I'm not quite sure how to analyze the output of benchrun. I couldn't find more information on this except http://www.mongodb.org/display/DOCS/JS+Benchmarking+Harness
[12:07:48] <lackjax> In an update, suppose I use the criteria { "_id": 1234, "somefield1": 1, ..., "somefieldn": n } where the somefields aren't indexed but the _id obviously is. That's not going to have any measurable performance impact, is it? I mean, the _id will ensure that the record to be updated is found in O(1), right?
[12:16:28] <wereHamster> lackjax: why do you include the other fields? The id is guaranteed to be unique
[12:17:06] <remonvv> I was about to ask.
[12:18:54] <lackjax> wereHamster: Upserting. Trying to avoid having to specify those fields in a $set.
[12:20:42] <remonvv> why?
[12:21:43] <remonvv> Anyway, to answer your question, it's still an immediate b-tree walk so it's fast, close to O(1)
[12:25:56] <remonvv> it's actually O(log(N))
[12:31:01] <souza> Hello guys, i have a question: if a field doesn't exist in a collection and i use $set on that non-existing field, will it be created?
[12:31:23] <kali> yes
[12:32:41] <remonvv> Wouldn't that take all of 2 seconds to test? Effort is your friend ;)
[12:33:48] <souza> remonvv: Because i tried to do this, but it didn't work, so i want to be sure! :P
[12:34:23] <remonvv> Fair enough.
[12:36:53] <ron> Fairy nuff
[12:38:08] <kali> souza: are you the one struggling with the C driver ?
[12:42:07] <souza> kali: i've to study hard, but now i got it!
[12:42:18] <souza> i'd
[12:43:34] <kali> souza: i just think i should have warned you against discovering C and mongo at the same time, and advised you to start studying mongo with an easier-to-use language... maybe it's not too late
[12:46:43] <souza> kali: I had no choice, i had to learn C and MongoDB at the same time for a research project i'm working on.
[12:56:00] <algernon> there's nothing wrong with learning mongo and C at the same time. the problem is that the official driver doesn't make that easy :P
[12:56:46] <augustl> what are typical strategies for adding maintaining indices on a db? I'm thinking I'll write a script that applies the indices to a given db.
[13:03:48] <kali> augustl: here we add a static method on the model classes, and we have a scriptlet that iterates over the classes and runs the method if it exists, which is plugged into the deploy scripts
[13:04:37] <augustl> kali: I see
[13:04:45] <augustl> I suppose mongo handles adding already existing indexes?
[13:05:02] <augustl> or do I need to check for that so I don't re-add indexes that are already there?
[13:06:49] <kali> augustl: the ruby driver has an ensure_index... and the js console too. I think this is the semantics in all drivers, actually
[13:08:39] <augustl> kali: ah
[13:08:56] <augustl> the node.js driver has it too, shouldn't be a problem then
[13:14:56] <icedstitch> what kinds of risks/concerns should I have in upgrading from Mongo 2.0.0 to 2.0.6 in ubuntu?
[13:15:07] <icedstitch> Any preparations i must do?
[13:21:46] <icedstitch> Have there been any known issues on upgrading minor versions of mongo before(from stable to stable releases)?
[13:30:53] <icedstitch> Basically read the release note then?
[13:30:57] <icedstitch> notes*
[13:33:06] <icedstitch> ok... now, for another question.
[13:33:52] <icedstitch> How does one find the release versions for .deb files for ubuntu? Is there a webpage for it?
[13:42:08] <jY> icedstitch: version is in the filename
[13:42:29] <jY> for the mongo apt repo.. it will always be the latest stable
[13:44:17] <icedstitch> jY: awesome. What kind of turnaround timeframe should I expect the .deb version to be ready when the 2.2 stable is released, same day, 3 days, or two+ weeks?
[13:47:10] <icedstitch> i'm presuming the tar.gz will be the first to be released with the package managers coming later.
[13:47:18] <boll> I can't seem to figure out to what extent mongod can take advantage of multiple cores and hyperthreading. Is there any official documentation available on the subject?
[13:47:55] <jY> icedstitch: my experience is same time
[13:50:59] <icedstitch> boll: it's the same issues mysql faces too.
[13:51:15] <icedstitch> i know little of it, except via a blog post i read
[13:51:24] <icedstitch> that's all i can help with....
[13:51:39] <boll> I never really did any serious work on mysql, so I really have no idea how it works :)
[13:51:46] <icedstitch> give me a moment or two and I "should" be able to post you the link
[13:51:56] <boll> excellent, much appreciated
[13:52:10] <icedstitch> i have, and I never ran into the issues as I had with mongodb on my x64 linux archs
[13:52:21] <lackjax> Can this be done? I want to either insert { "_id": "1", name: "Bobo", "counter": 100 } or update the counter by 1 if it exists. My best effort is as follows, but doesn't work (it inserts counter 101). db.tests.update({ "_id": "1", "counter": 100 }, { "$set": { "name": "Bobo" }, "$inc": { "counter": 1 } }, true)
[13:54:41] <icedstitch> boll: http://blog.manjusri.org/2011/03/17/administrating-mongodb/ read this as this is pretty informative imo(i do wish to be corrected by the "experts" in this room if i should be), a little down the page you'll read about numactl
[13:54:45] <icedstitch> Good luck
[14:01:40] <lackjax> Is there anything like SERVER-340 implemented?
[14:02:23] <NodeX> lackjax : I told you how to do it earlier
[14:02:39] <NodeX> you also cannot upsert on _id
[14:03:52] <lackjax> I dont think I can use $set NodeX because I need to $inc counter. "Field name duplication not allowed with modifiers"
[14:04:24] <lackjax> NodeX: I think I can use _id because I am using my own ids for this collection, not BSON ids. Does that sound okay?
[14:04:49] <NodeX> you need another identifier
[14:05:33] <NodeX> db.foo.update({bar:1},{$set : {eggs:"ham"}, $inc : {number:42}},true);
[14:05:59] <NodeX> that will set or increment "number" by/to 42 for the document bar = 1
[14:07:53] <lackjax> NodeX: Right. But I need to set to one number (arbitrary) on inserts but increment by another number (1) on updates. Sounds like I need the stuff talked about in 340. Theres no way to do that at the moment?
[14:09:18] <NodeX> you cannot do it no
[14:09:35] <NodeX> that's an appside problem anyway
[14:09:40] <lackjax> fiddlesticks
[14:10:40] <NodeX> that's an old issue lol
[14:13:52] <lackjax> It seems odd to me that if you put counter: 100 in the criteria and $inc => counter: 1 in the update you get 101. Running the modifiers on something thats just been created I mean seems strange to me, but I presume thats intended behaviour?
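
That is in fact the documented upsert behaviour: on an insert, equality fields from the query are copied into the new document first, and the modifiers are then applied on top of them. A sketch (collection name hypothetical):

    // No document matches, so the upsert inserts one:
    db.tests.update(
        { _id: "1", counter: 100 },                        // query fields seed the new doc
        { $set: { name: "Bobo" }, $inc: { counter: 1 } },  // then modifiers run against it
        true
    )
    db.tests.findOne({ _id: "1" })
    // -> { _id: "1", counter: 101, name: "Bobo" }
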
[14:24:42] <spillere> can anyone help me with a query? My db has, for example, {username: spillere, photos: [{photo: name, url: url}, {photo: name2, url: url2}]}
[14:25:05] <spillere> if I have the name2, how can I get all the information on that {}
[14:28:41] <NodeX> db.foo.find({"photos.photo":"name2"});
[14:29:12] <modcure> nice, new to mongodb
[14:29:46] <modcure> NodeX, is there a way to see if it exists ?
[14:30:01] <NodeX> db.foo.find({"photos.photo":{$exists:true}});
[14:30:22] <NodeX> might be ...
[14:30:27] <NodeX> db.foo.find({"photos.photo":{'$exists':true}});
[14:30:49] <NodeX> I never do it on the shell so I forget if those meta's need to be quoted
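
For what it's worth, in the JS shell both forms are equivalent, since $exists is a valid identifier there; the quoting mostly matters in language drivers:

    db.foo.find({ "photos.photo": { $exists: true } })
    db.foo.find({ "photos.photo": { "$exists": true } })
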
[14:30:51] <spillere> what do $exists do?
[14:31:08] <NodeX> spillere : err it checks if there is a value
[14:31:37] <spillere> in python, i can do z = db.foo.find({"photos.photo":{'$exists':true}})
[14:31:44] <spillere> lets try :)
[14:32:02] <Swimming_Bird> $exists isn't very efficient if it's on the indexed field
[14:33:55] <spillere> i'm doing db.dataz.find({'photos.filename':'zc2gjq0vfi.jpg'})
[14:34:04] <spillere> it basically show all the user's info
[14:34:39] <spillere> is there a way to only show what's inside the {} of name2? for example
[14:35:17] <NodeX> slice
[14:35:20] <NodeX> $slice
[14:35:36] <NodeX> http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields <--
[14:43:41] <spillere> NodeX thanks, but I can't quite understand all
[14:44:25] <spillere> when I do db.dataz.find({'username':'dansku'},{'photos.filename':'zc2gjq0vfi.jpg'}), it get all photos.filenames, and not just the JPG i want
[14:45:40] <NodeX> that's not how it works
[14:46:00] <NodeX> you can slice part of the object / part of an array and that's it
[14:46:18] <spillere> I know, but I didnt quite get how to get all between {} from where photo = name2
[14:46:27] <spillere> like
[14:46:57] <spillere> in photos.filename, how do I slice all inside {}
[14:47:29] <augustl> will ensureIndex({name: 1, userId: 1}, {unique: true}) make the name uniqueness constraint scoped to userId? So that the uniqueness only applies to other documents with the same userId?
[14:49:42] <NodeX> augustl : from what I remember it will make name and userId unique
[14:49:46] <DETROIT_POWER> pmxbot: Sssshhhh.
[14:50:11] <NodeX> name:a, userId:a -> true, name:a, userId:b -> true, name:a, userId:a -> dupe
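
A sketch of that compound unique index and the behaviour NodeX describes (collection name hypothetical):

    db.items.ensureIndex({ name: 1, userId: 1 }, { unique: true })

    db.items.insert({ name: "a", userId: 1 })   // ok
    db.items.insert({ name: "a", userId: 2 })   // ok - same name, different userId
    db.items.insert({ name: "a", userId: 1 })   // duplicate key error - pair already exists
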
[14:50:57] <NodeX> spillere : I dont understand what you want sorry
[14:50:59] <augustl> that's what I want, cool
[14:51:10] <spillere> http://stackoverflow.com/questions/6387072/mongodb-retrieve-array-subset/6783535#6783535
[14:51:15] <spillere> its basically that
[14:52:15] <DETROIT_POWER> Why does Mongodb have no mechanism to ensure documents in an index have homogeneous data structures?
[14:53:06] <NodeX> can you explain that differently DETROIT_POWER
[14:53:08] <DETROIT_POWER> And if I were to use Mongodb, can it export to SQL syntax to create tables?
[14:53:20] <NodeX> spillere : as one of the answers states - it is NOT possible at present
[14:53:37] <DETROIT_POWER> NodeX: Essentially, that documents are rows all having the same columns.
[14:53:46] <NodeX> because it's down to your app
[14:54:01] <NodeX> and the point of no schema is exactly that .. free will to do what you please
[14:54:15] <algernon> ...and mongo has no columns or rows.
[14:54:21] <NodeX> and that nor tables
[14:54:45] <DETROIT_POWER> algernon: A document can be thought of as a row, except that its columns can be different from any other row.
[14:54:57] <NodeX> best advice ever for someone thinking of moving from SQL to Mongo: Forget everything you know about RDBMS and start afresh
[14:55:10] <DETROIT_POWER> :(
[14:55:14] <algernon> DETROIT_POWER: only as much as a duck is a hawk.
[14:55:16] <augustl> DETROIT_POWER: sounds like something you should do in your application. I don't see why a document database should have this in its core
[14:55:32] <DETROIT_POWER> Okay.
[14:55:39] <NodeX> this is an Appside problem which most people use some sort of ORM to deal with
[14:55:58] <spillere> NodeX I understand, thanks! :)
[14:56:04] <augustl> without knowing much about internals, perhaps write/read performance could improve by knowing beforehand the size of the documents?
[14:56:25] <DETROIT_POWER> Well, can I still do joins?
[14:56:30] <NodeX> nope
[14:56:31] <NodeX> lol
[14:56:31] <augustl> mongo does not have joins
[14:56:46] <NodeX> it does however have subsets / embedded documents
[14:56:54] <DETROIT_POWER> NodeX: Oh.
[14:56:54] <augustl> I'm not aware of any document databases with joins
[14:57:17] <DETROIT_POWER> NodeX: Where can I read about embedded documents?
[14:57:28] <augustl> DETROIT_POWER: the docs have some pages on schema design
[14:57:41] <NodeX> as I said - forget RDBMS and everything you know about it because it will hinder your development with mongo, you cannot think of it in terms of any relational anything
[14:57:56] <NodeX> your head will hurt, you have to re-think how you store / access data
[14:58:23] <DETROIT_POWER> Mr. Codd's DB2 ruined me
[14:58:25] <NodeX> DETROIT_POWER : Do you know much about Json ?
[14:58:43] <DETROIT_POWER> NodeX: I am familiar with it, yes.
[14:59:01] <NodeX> you can think of Mongo as searchable Json really
[14:59:08] <DETROIT_POWER> Oh, cool
[14:59:18] <modcure> spillere, im not sure how one would tackle that...
[14:59:20] <NodeX> you can go very deep with nesting arrays etc
[14:59:28] <augustl> DETROIT_POWER: click around on the mongodb home page then :P
[14:59:29] <NodeX> you can reach into arrays
[14:59:37] <DETROIT_POWER> So I can have an 8 GB JS object? :)
[14:59:45] <spillere> modcure i think ill try some python algorithm for that
[14:59:51] <NodeX> no
[14:59:52] <augustl> limit is 16mb
[14:59:57] <modcure> spillere, maybe exlude all fields except the array.. return the array to app..
[14:59:58] <NodeX> 16mb per doc iirc atm
[15:00:07] <NodeX> set to rise with Moore's law
[15:00:15] <giessel> is there anyone here who can talk to me about the code in /util/net
[15:00:19] <augustl> there are mechanisms to store binary blobs transparently across multiple docs though
[15:00:19] <modcure> spillere, or slice the array to return the second element
[15:00:32] <DETROIT_POWER> NodeX: What if I have an embedded document? Then can I get to 32mb?
[15:00:37] <NodeX> no
[15:00:42] <spillere> modcure i shouldnt slice it as the value may change order
[15:00:54] <NodeX> and I am not sure why you would want to
[15:01:04] <NodeX> that is alot of data to bring over the wire
[15:01:08] <DETROIT_POWER> NodeX: We store a lot of base64 data.
[15:01:26] <modcure> spillere, db.posts.find({"photos.photo":"name2"},{ username: 0, _id: 0 }); will return the array of documents..
[15:01:32] <DETROIT_POWER> Mostly stills of Madonna footage and sound.
[15:01:46] <NodeX> gridfs is what you want
[15:02:00] <NodeX> it will cluster and shard chunks of data in a nice scalable and fast way
[15:02:16] <modcure> NodeX, db.posts.find({"photos.photo":"name2"},{ username: 0, _id: 0 }, {photos:{$slice:[1,1]}}); <-- any reason why the $slice does not work in this case?
[15:02:26] <NodeX> http://www.mongodb.org/display/DOCS/GridFS
[15:03:05] <NodeX> modcure : I am not sure that slice works like that - I dont use it alot so I could be wrong
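
One likely culprit: the projection is find()'s second argument, so the $slice has to sit inside it rather than in a separate third document. A sketch using the field names from the example above:

    db.posts.find(
        { "photos.photo": "name2" },
        { username: 0, _id: 0, photos: { $slice: [1, 1] } }   // skip 1 element, return 1
    )
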
[15:04:08] <DETROIT_POWER> So what does Mongo use for evaluating ECMAScript? v8? :)
[15:05:08] <augustl> DETROIT_POWER: for map/reduce you mean? Or the shell?
[15:05:25] <DETROIT_POWER> augustl: Map/reduce
[15:06:21] <NodeX> spider monkey I think
[15:06:23] <augustl> DETROIT_POWER: spidermonkey
[15:06:41] <augustl> apparently replacing it with V8 is being considered
[15:06:43] <giessel> gridfs is so annoying to use
[15:06:46] <DETROIT_POWER> I am using the shell now and I just locked it up when I tried to get the Global Object with: var m = (function() { return this; })()
[15:06:52] <giessel> i'm about this close || to finishing my patch
[15:07:14] <DETROIT_POWER> Is Spidermonkey really good now?
[15:07:28] <DETROIT_POWER> Is the shell spidermonkey to?
[15:07:31] <DETROIT_POWER> *too
[15:07:33] <NodeX> my shell doesnt lock
[15:07:51] <spillere> maybe you guys can help me have another way to tackle this problems.
[15:07:55] <spillere> this is the db http://pastie.org/4160479
[15:08:02] <spillere> i have many pictures
[15:08:25] <DETROIT_POWER> NodeX: I am using the browser console on Mongodb's site.
[15:08:31] <spillere> but when someone goes to url.com/picture/NAME, which is the filename without the jpg
[15:08:44] <spillere> "file_folder" : "6gpebv9uat",
[15:08:51] <NodeX> spillere : I had the same problem when embedding pictures in arrays, I ended up making a pictures collection - it was just easier
[15:08:54] <DETROIT_POWER> var m = (function() { return this; })() > JS Error: RangeError: Maximum call stack size exceeded
[15:09:16] <NodeX> DETROIT_POWER : that could be the website's console.
[15:09:17] <spillere> how do you do it when you can't do joins?
[15:09:27] <modcure> another select
[15:09:29] <DETROIT_POWER> NodeX: Ah.
[15:09:39] <spillere> like have a collection for users a collection for pictures?
[15:09:50] <DETROIT_POWER> NodeX: Looks like it is. I can run jQuery from the console :)
[15:10:15] <spillere> NodeX so everypicture have it's own ID then?
[15:10:22] <NodeX> DETROIT_POWER : I moved from RDBMS about 2 ish years ago and have not found one thing (apart from aggregation) that I cannot do with Mongo, and its speed is extremely fast and it's mundo scalable
[15:10:27] <NodeX> I cannot recommend it more highly
[15:10:40] <NodeX> spillere : that's what I did and it worked better for me
[15:10:52] <spillere> one question
[15:11:03] <spillere> then for example, how can I show all pictures of one single user then?
[15:11:16] <NodeX> save the user id with the picture
[15:11:29] <NodeX> db.pictures.find({userid:1234});
[15:11:42] <spillere> it will display all pictures form that user
[15:11:45] <spillere> interesting
[15:11:45] <DETROIT_POWER> NodeX: How do you go about ensuring you don't suffer data loss? My boss thinks Postgresql is much safer for storing our CRM's data.
[15:12:07] <NodeX> data loss in transit or failures
[15:12:10] <NodeX> ?
[15:13:13] <NodeX> for transit there are options to make sure the data is "fsynced" to disk before the function returns an "ok", for redundancy there is what's called "replicaSets" which are master/slave
[15:13:25] <DETROIT_POWER> NodeX: Failures.
[15:13:34] <NodeX> you can make sure your data is written to N replicaSets before returning
[15:13:44] <NodeX> for failures there is journaling and an oplog
[15:13:51] <NodeX> plus your replicaSets ^^
[15:14:15] <NodeX> with replicaSets (if I recall correctly) if the master goes down then a new master is automagically elected
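
A sketch of those safety options with the 2012-era API, where a getLastError call after the write waits for the requested durability (collection name hypothetical):

    db.orders.insert({ _id: 1, total: 42 })

    // Wait until the write has replicated to 2 replica set members (10s timeout):
    db.runCommand({ getLastError: 1, w: 2, wtimeout: 10000 })

    // Or wait until it has been flushed to disk on this server:
    db.runCommand({ getLastError: 1, fsync: true })
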
[15:14:30] <DETROIT_POWER> Does waiting for fsync before the function returns kill performance though?
[15:14:35] <NodeX> not really
[15:14:45] <NodeX> depends how transactional you need
[15:14:56] <NodeX> I have not once lost data and I dont even do it
[15:15:14] <NodeX> the danger is things in RAM wont get written to disk in a power outage is all
[15:15:16] <augustl> DETROIT_POWER: if you have tabular data with schemas, why not use postgresql?
[15:15:34] <DETROIT_POWER> Are you using Mongodb in a commercial application?
[15:15:39] <NodeX> yes, many
[15:16:03] <NodeX> I have not touched SQL for a very long time
[15:16:48] <NodeX> I have built a Full CRM in Mongo, a Full CMS, a dating site, The UK's fastest Jobsite (aside google obviously)
[15:17:02] <modcure> nice
[15:17:09] <DETROIT_POWER> NodeX: Why do you prefer Mongo over SQL? And do you use an ORM?
[15:17:45] <NodeX> I prefer the way it stores / handles data, I dont like the quirks of *SQL, I dont like the lack of security of SQL, I dont use an ORM
[15:17:46] <dgottlieb> giessel: I don't know anything about those files, but I am curious what you're looking at in there
[15:17:54] <NodeX> I built my own wrapper to a driver as a helper
[15:17:59] <DETROIT_POWER> cool
[15:18:29] <NodeX> you can't (as long as you sanitise to some degree) overflow the system
[15:18:58] <NodeX> at worst someone can corrupt data, not r00t your server lol
[15:20:21] <NodeX> the only thing that lets it down is lack of full text searching ... you can regex but it's a pain
[15:20:22] <augustl> NodeX: hi5 for building a wrapper over the drivers!
[15:20:29] <augustl> ORMs makes my life suck
[15:20:32] <NodeX> but I outsource that to SOLR anyway
[15:20:38] <DETROIT_POWER> NodeX: What do you use for your middleware? Sorry for all the questions!
[15:20:46] <augustl> hi5 for using solr for searching :D
[15:20:57] <NodeX> augustl : I dont like using frameworks or templating engines .. I like my apps fast
[15:21:06] <NodeX> Middleware ?
[15:21:25] <NodeX> augustl : I was gonna use Elasticsearch but it's a bit green for my app
[15:21:52] <augustl> that's the AWS thing? I prefer self-hosted..
[15:22:10] <DETROIT_POWER> NodeX: Where your application's model that syncs to Mongodb sits.
[15:22:11] <NodeX> DETROIT_POWER : I dont know what "Middleware" is, I dont keep up with trendy words sorry
[15:22:19] <NodeX> ah, I use PHP
[15:22:39] <multiHYP> hi
[15:22:56] <NodeX> I sit redis, php, solr on the same line as Mongo
[15:22:56] <multiHYP> is the date or isodate in mongodb of any good?
[15:23:19] <NodeX> multiHYP : good for what ?
[15:23:24] <NodeX> making coffee .. no!
[15:23:27] <multiHYP> use
[15:23:28] <DETROIT_POWER> I see. Does the PHP driver keep a connection to Mongodb open across connections?
[15:23:39] <DETROIT_POWER> Err, across queries
[15:23:39] <NodeX> you can persist connections or open one per pool
[15:24:04] <NodeX> I used to persist on session_id but I found when my app was busy I would get too many open files errors
[15:24:22] <NodeX> so I dont bother now, plus I cache a hell of alot in redis
[15:24:23] <augustl> DETROIT_POWER: I do that for a Node.js app
[15:24:33] <giessel> dgottlieb: ya, it's a secondary check to document size
[15:24:39] <augustl> DETROIT_POWER: full text search = special type of indexes for textual data, something mongo doesn't have
[15:24:43] <giessel> dgottlieb: which it seems i have just fixedddddd (!)
[15:24:53] <giessel> inserted a 168mb doc
[15:25:01] <DETROIT_POWER> augustl: So text searching is slower?
[15:25:02] <NodeX> full text search - for example ... SELECT * FROM bar WHERE (a LIKE '%B' or c LIKE '%D') etc etc
[15:25:12] <giessel> big data big data
[15:25:14] <multiHYP> i mean is anyone making use of mongodb's date facilities or date is done in the app layer?
[15:25:17] <augustl> DETROIT_POWER: typically, full text engines are horrible at searching for "first name NOT contains 'foo'" for example, due to the way the indexing works
[15:25:38] <dgottlieb> giessel: awesome! I saw the half gig BSON size and just thought there's no way this is going to work if the data is actually that big
[15:25:47] <NodeX> I realised after a lot of messing about that it's far better to let SOLR do my searching
[15:25:48] <giessel> lol
[15:25:48] <kali> multiHYP: i use the date type all the time
[15:25:55] <giessel> dgottlieb: haven't tried anything that large
[15:25:57] <multiHYP> kali: how?
[15:26:03] <multiHYP> especially with custom formats.
[15:26:18] <giessel> dgottlieb: generally i think our largest objects will be around 150mb. but it remains to be seen.
[15:26:18] <kali> multiHYP: custom formats ? that's not a model thing
[15:26:19] <dgottlieb> giessel: I think 168MB is pretty much just as large as half a gig :)
[15:26:33] <NodeX> give or take a few hundred meg lol
[15:26:41] <DETROIT_POWER> So, it's just really slow to find a substring in Mongodb compared to SQL where you have LIKE / ILIKE?
[15:26:44] <multiHYP> kali: so how do you use mongodb's date?
[15:26:48] <giessel> dgottlieb: we're a really unique case- we're not pushing stuff over the web, this will be for local storage and analysis. our data doesn't break up well
[15:26:53] <NodeX> anything that's not prefixed yes
[15:26:55] <augustl> DETROIT_POWER: no, LIKE is similar (ish)
[15:27:03] <augustl> DETROIT_POWER: solr is much faster than LIKE in SQL
[15:27:03] <kali> multiHYP: to store date, in a canonical form
[15:27:05] <dgottlieb> multiHYP: most drivers will convert their driver specific date into mongo's date type
[15:27:07] <NodeX> as I say it has regex .... but Mongo is not designed for this kind of thing
[15:27:08] <kali> multiHYP: the rest is application land
[15:27:13] <giessel> dgottlieb: it's crazy because every time i've logged in here i see people complaining about the 16mb limit heh
[15:27:28] <multiHYP> kali: can you give a query example of that?
[15:27:30] <dgottlieb> multiHYP: sorry, most drivers will convert their languages standard date objects into mongo dates
[15:27:50] <NodeX> giessel : doesn't gridfs split it for you - I thought that was the point of it
[15:28:26] <algernon> not only is SOLR much faster than LIKE, it is also more powerful.
[15:28:34] <kali> multiHYP: db.foo.save({ now: ISODate()})
[15:28:42] <kali> multiHYP: honestly, i don't understand what you're asking
[15:28:42] <multiHYP> kali: ok thanks
[15:28:46] <NodeX> SOLR is a very very good tool
[15:28:54] <NodeX> if only Ebay could use it properly
[15:29:06] <augustl> DETROIT_POWER: saying that mongodb doesn't have full text search isn't very interesting tbh ;) Most databases don't.
[15:29:07] <multiHYP> dgottlieb: thats the thing, i am handling custom date formats in app layer(java) too
[15:29:07] <NodeX> I could find what I want without 14 searches surrounding my thing
[15:29:09] <kali> lucene is great, solr is... ok
[15:29:17] <DETROIT_POWER> augustl: ah
[15:29:23] <NodeX> DETROIT_POWER : it does't, yo have to take care of that in your App
[15:29:34] <multiHYP> so at one level I use only yyyymmdd and in another place I use hh:mm:ss
[15:29:44] <NodeX> yo -> you *
[15:29:48] <dgottlieb> multiHYP: custom date string* formats or custom date objects that aren't java.util.Date ?
[15:29:49] <multiHYP> mongo date does not allow that sort of thing.
[15:29:56] <DETROIT_POWER> So SOLR would give me a bunch of primary keys/document ids?
[15:30:00] <kali> multiHYP: nope, mongo does not store the format, it stock the time
[15:30:01] <augustl> DETROIT_POWER: typically, when you call out to store your data in mongo, you also call out to solr with your data to index it
[15:30:05] <kali> multiHYP: store
[15:30:05] <multiHYP> they are saved as strings yes
[15:30:12] <giessel> dgottlieb: it does, but it is a pain in the ass to work with
[15:30:18] <DETROIT_POWER> augustl: Ah!
[15:30:35] <augustl> DETROIT_POWER: that depends on what you index in solr, you can have it return the indexed data, or the ids and then do a second query to mongo and get actual data from there
[15:30:36] <NodeX> DETROIT_POWER : no, mongo would be your datastore and db for main seraching, if you have things you need to use lucene on you revert to solr
[15:30:38] <kali> multiHYP: storing dates as string is usually asking for trouble :)
[15:30:46] <giessel> had to build a meta layer of code to handle dealing with meta data and such
[15:30:51] <NodeX> for example, I index my "jobs" collection, but not my "users"
[15:30:55] <NodeX> (in SOLR)
[15:30:55] <multiHYP> kali: there is no other option…!?
[15:31:15] <multiHYP> in mongo you cannot do a date that consists of only year, month and day
[15:31:19] <giessel> NodeX: er sorry, i sent that to dgottlieb instead
[15:31:21] <remonvv> MongoDB stores UTC dates, java.util.Date is always UTC
[15:31:28] <giessel> NodeX: gridfs is a pain in the ass
[15:31:37] <multiHYP> remonvv: thats irrelevant.
[15:31:58] <NodeX> giessel : I have not used it since 1.5 when I had some problems with it
[15:32:07] <NodeX> I am out of touch with it
[15:32:22] <kali> multiHYP: year month and day is equivalent to year month and day at midnight..
[15:32:22] <giessel> NodeX: it does work, it's just inconvenient and takes a lot of consideration to get it right
[15:32:27] <remonvv> No it isn't. This discussion comes by every other day. What date you're storing and how it's visualized is not related. If you want to store a year/month/day only simply store a clamped date and visualize just the year, month and day
[15:32:36] <multiHYP> kali: yes
[15:32:38] <multiHYP> and?
[15:33:04] <giessel> NodeX: we had to search through our dictionary, looking for data of a given type, then encode those data, insert them in the gridfs and replace them in our dictionary with a gridfs path
[15:33:05] <multiHYP> remonvv: clamped date?
[15:33:17] <DETROIT_POWER> NodeX: So if I wanted to do a full text search *and* handle another criterion like a price range, I would take the document IDs output by SOLR and supply them to Mongodb along with my price range condition?
[15:33:18] <NodeX> ah
[15:33:18] <kali> multiHYP: so you store that in mongo, and display whatever you need to display in your app
[15:33:35] <giessel> NodeX: if you wanted to delete something, you had to make sure that the gridfs file disappeared or you'd have a space leak
[15:33:38] <remonvv> 2012/12/01 23:17:12 -> 2012/12/01 00:00:00
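
A sketch of that clamped-date approach: store a real Date with the time zeroed out, and day-level lookups remain ordinary range queries on the 8-byte value (collection name hypothetical):

    // Store only the day by clamping the time portion to midnight UTC
    db.events.insert({ name: "demo", date: ISODate("2012-06-27T00:00:00Z") })

    // "Everything in June 2012" as a range query
    db.events.find({
        date: { $gte: ISODate("2012-06-01T00:00:00Z"),
                $lt:  ISODate("2012-07-01T00:00:00Z") }
    })
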
[15:33:41] <multiHYP> i know, but its bigger in size and unnecessarily verbose
[15:33:54] <remonvv> you either store a date or a non-date value that happens to look like part of a date.
[15:34:05] <multiHYP> its like saving too much useless repetitive date information.
[15:34:08] <NodeX> DETROIT_POWER : one of my apps is pure SOLR for searching, but it's because of my need for facets and FTM. If I didn't need it I would do it all in Mongo as it's less data to handle
[15:34:31] <multiHYP> remonvv: so thats what I'm asking. is it possible to have a custom date stored in mongodb?
[15:34:35] <remonvv> multiHYP. there are VERY few valid use cases for storing just the year, month and day that are not related to visualisation
[15:34:58] <multiHYP> again, i don't care about visualisation either.
[15:35:04] <remonvv> yes, convert it to something you're more comfortable with and store that. It just wont be a date (and date specific functionality will therefor not work)
[15:35:05] <NodeX> DETROIT_POWER : in SOLR you can store/index as much as you want... I spit out alot of data to alot of users so I prefer to not hit mongo everytime I need to do that and I store enough for a "quick view" of a job for example
[15:35:09] <multiHYP> I'm talking strictly date
[15:35:16] <remonvv> okay, so store the year/month/day bytes in a binary blob and store that
[15:35:25] <NodeX> you can do it either way - it depends on how you want to model your data
[15:35:25] <remonvv> it'll still allow range queries and everything
[15:35:26] <multiHYP> ok thats what I'm doing already. mongo deals with them as strings
[15:35:39] <multiHYP> oh
[15:35:53] <multiHYP> as binary blob? how would that be in java?
[15:35:55] <remonvv> no it doesn't. All mongo range checking is binary based so if you store it binary correctly (e.g. store year, month, day in that order) it'll work
[15:36:05] <remonvv> in Java. Hm, think it has a wrapper type
[15:36:09] <DETROIT_POWER> I see. But if you have a complex query combined with a full text search, you'd still need to hit up Mongo? For example, searching within a date range.
[15:36:09] <remonvv> let me check
[15:36:18] <multiHYP> yes i know that, thats why i have yyyymmdd now, for easy sorting.
[15:36:32] <remonvv> Ah, byte[] works these days
[15:36:38] <multiHYP> thats no binary though, its string.
[15:36:39] <kali> multiHYP: nope, it's actually smaller. a date in bson is 64 bits. any string storage will be bigger
[15:36:42] <remonvv> no..yyyymmdd is not binary
[15:36:50] <remonvv> "20120101" you mean?
[15:36:55] <multiHYP> yes
[15:37:00] <remonvv> also sortable but much bigger than just storing a date
[15:37:03] <NodeX> DETROIT_POWER : one thing I will recommend you bear in mind with Mongo is that everything is typecasted. in SQL you can do SELECT foo FROM bar WHERE foo=1; .... SELECT foo FROM bar WHERE foo='1' - both being the same (in mysql at least)
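
A sketch of that typecasting point: mongo compares by BSON type, so a number and the equivalent string do not match the same documents (collection name hypothetical):

    db.bar.insert({ foo: 1 })     // number
    db.bar.insert({ foo: "1" })   // string

    db.bar.find({ foo: 1 })       // matches only the numeric document
    db.bar.find({ foo: "1" })     // matches only the string document
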
[15:37:03] <multiHYP> string but still sortable.
[15:37:07] <remonvv> i thought you were trying to save space
[15:37:10] <multiHYP> oh
[15:37:18] <multiHYP> i do
[15:37:20] <remonvv> a date is very small
[15:37:21] <multiHYP> didn't know that
[15:37:21] <NodeX> DETROIT_POWER : no, you would do the whole thing in SOLR because it does it all
[15:37:23] <giessel> ok, logging for now
[15:37:28] <remonvv> the only way to get it smaller is only storing part of it
[15:37:35] <DETROIT_POWER> NodeX: Oh!
[15:37:48] <NodeX> but solr is volatile so you need a datastore too - either mongo, redis, SQL w/e
[15:37:49] <remonvv> binary blobs have an additional length bit so i'm not even sure if it helps that much or at all
[15:37:55] <giessel> dgottlieb: thanks again. seems to be working for now. until it doesn't! ta!
[15:38:10] <multiHYP> so how big is ISODate or mongodb date?
[15:38:10] <DETROIT_POWER> I see
[15:38:27] <NodeX> that said you can do a range query with no FTM in mongo easily, you can do "in (a,b,c)" in mongo
[15:38:27] <remonvv> 64 bits
[15:38:31] <NodeX> that sort of thing
[15:38:43] <NodeX> $gt, $lt,
[15:38:54] <multiHYP> how big would the string "20120627" be?
[15:38:55] <NodeX> you can do upserts - which is one of the best things ever invented
[15:39:08] <multiHYP> well key, value: {"date":"20120627"}
[15:39:22] <kali> 4 bytes for size, 8 for data, 1 for final 0 -> 13, I win
[15:39:29] <remonvv> multiHYP, 64 bits. Smallest amount of bits to get year/month/day in is probably years since 1970 plus month plus day so say 7 + 5 + 6 = 18 bits
[15:39:35] <multiHYP> ?
[15:39:49] <kali> date type -> just 8 bytes
[15:39:53] <remonvv> kali, that's for a string right? binary can be smaller
[15:39:57] <NodeX> geeks :P
[15:40:03] <remonvv> full date is 8 bytes, a year/month/day can be done in 3
[15:40:18] <DETROIT_POWER> remonvv: Speaking of 64 bits, is Mongodb better on a 64-bit CPU?
[15:40:19] <multiHYP> still the question is: whats the size of that key value?
[15:40:23] <remonvv> DETROIT_POWER, yes
[15:40:24] <kali> remonv: binary::=int32 subtype (byte*)
[15:40:30] <multiHYP> ISODate is 64 bits, what about that.?
[15:40:31] <kali> remonvv: same :)
[15:40:31] <NodeX> on 32bit you're limited to 2gb
[15:40:38] <NodeX> 2gb collections **
[15:40:41] <remonvv> multiHYP, 4 bytes for the binary length field plus 3 bytes, so 7
[15:40:44] <remonvv> you're winning 1 byte
[15:40:46] <remonvv> for a lot of hassle
[15:40:50] <multiHYP> :D
[15:40:53] <multiHYP> i know
[15:40:56] <remonvv> kali, not quite ;)
[15:41:01] <multiHYP> but the db looks cleaner
[15:41:09] <remonvv> a lot cleaner yes
[15:41:11] <kali> remonvv: you need the subtype ?
[15:41:12] <multiHYP> 8 bytes is a lot
[15:41:29] <DETROIT_POWER> NodeX: I probably need to get SOLR soon. LIKE is getting slow now that there's thousands of rows :(
[15:41:37] <remonvv> kali, ah, you're right. Apologies ;)
[15:41:37] <kali> cleaner ? ho yeah, i love when my database spew binary stuff at me
[15:41:48] <NodeX> do you do alot of FTM DETROIT_POWER ?
[15:41:48] <remonvv> i think he meant date is cleaner ;)
[15:41:50] <remonvv> i hope he did
[15:41:58] <kali> i hope he does :)
[15:42:09] <kali> but i'm not sure :)
[15:42:20] <multiHYP> kali: its not binary: its {"date":"20120627"} as opposed to ISODate("2012-06-27T15:39:24.886Z")
[15:42:23] <remonvv> long story short, unless you're storing 100 dates per document and that 5 bytes header is less of an issue i'd just use dates
[15:42:40] <multiHYP> now which one looks cleaner?
[15:42:42] <kali> multiHYP: your date as a string consumes 13 bytes.
[15:42:47] <NodeX> DETROIT_POWER: some of my solr stats.. 250,000 (ish) document database, full geo spatial bounding box search with 4 varying keywords search time is 19ms
[15:43:02] <multiHYP> oh
[15:43:14] <NodeX> that index is currently 250mb of my memory or so
[15:43:14] <kali> multiHYP: date as a date, with hhmmss to 0 looks fine to me
[15:43:25] <kali> and is 8 bytes
[15:43:32] <remonvv> if you're sure all dates you need to store are in the future you might just save "120627" but that's still larger
[15:43:32] <multiHYP> well, I do need potentially 100s of dates in every document that need to be hh:mm:ss.
[15:43:39] <DETROIT_POWER> NodeX: What's FTM? :)
[15:43:52] <multiHYP> kali: how to make such date in mongodb though?
[15:43:57] <NodeX> Full text match ... oh and SOLR / Lucene Ranks and scores each document
[15:44:05] <multiHYP> there is no room for customisation or formatting.
[15:44:08] <DETROIT_POWER> NodeX: That's amazing. My Postgresql search queries are taking > 250ms
[15:44:14] <remonvv> Just use date ;)
[15:44:17] <kali> multiHYP: formatting is not mongodb's place
[15:44:23] <kali> multiHYP: this is your app's place
[15:44:33] <remonvv> Use date, use a DateFormat that's just yyyy/mm/dd and you're done
[15:44:41] <remonvv> Efficient, quick, date specific functionality works, etc.
[15:44:45] <multiHYP> remonvv: thats what i already have.
[15:44:46] <NodeX> you gain power when you abstract the searching to software that is designed purely for searching
[15:44:50] <remonvv> then you're done
[15:44:54] <NodeX> *SQL has to be designed for all of it
[15:44:57] <remonvv> if only we'd all be that lucky.
[15:45:04] <multiHYP> oh you mean use mongodb's default ISODate underneath?
[15:45:05] <NodeX> store, search, index, foo, bar
[15:45:11] <remonvv> yes
[15:45:17] <multiHYP> no i don't have that
[15:45:22] <multiHYP> because of the verbosity.
[15:45:37] <remonvv> what verbosity? it's the smallest in bytes and you just said it's not relevant for visualisation
[15:46:00] <multiHYP> yes, but when you search in mongo shell, the results are just ugly...
[15:46:21] <multiHYP> and the yyyy-mm-ddT just keeps repeating and so on.
[15:46:37] <remonvv> so you want to change your schema so it looks better in shell?
[15:46:52] <remonvv> then strings are your only alternative
[15:47:01] <remonvv> format better but are less efficient in terms of size
[15:47:02] <multiHYP> thats what happened yes, not that i want it necessarily.
[15:47:07] <remonvv> bit of a strange requirement though
[15:47:26] <multiHYP> yes i might need to revise that :)
[15:47:30] <multiHYP> *shrugs*
[15:51:00] <multiHYP> is there a benchmark between postgreSQL and mongodb?
[15:52:55] <multiHYP> i might have touched upon a sensitive issue.
[15:53:12] <NodeX> there are plenty on the internet
[15:53:47] <multiHYP> yes I'm watching some of those right now :)
[15:55:20] <multiHYP> for my use case though, it doesn't matter.
[15:56:22] <remonvv> You should also look at PostgreSQL vs MySQL benchmarks then ;)
[15:56:43] <multiHYP> any of you using PostgreSQL?
[15:56:51] <NodeX> I use the right tool for the right job
[15:56:53] <remonvv> If I had to pick between PostgreSQL and having my scrotum nailed to a board I'd be like "hand me that nailgun"
[15:57:06] <multiHYP> :D
[15:57:10] <NodeX> if I wanted to aggregate data and make lots of varied reports I would use a graphdb
[15:57:17] <multiHYP> yes mongo is easy for me too.
[15:57:30] <multiHYP> R
[15:57:39] <multiHYP> and csv :)
[15:57:44] <NodeX> mongo works how my head works - the opposite to relational lmfao
[15:57:55] <multiHYP> yep
[15:58:58] <multiHYP> I know what I will do, in db i use ISODate for everything, but where I need the date, I label it as "date" and where the time matters, I put it under "time" label.
[15:59:19] <NodeX> I normaly break the data into an object
[15:59:34] <NodeX> date.dmy date.hms
[15:59:41] <multiHYP> which brings another calculation to mind: {"date":ISODate()} might be still bigger in size than {"date":"20120627"}
[16:00:00] <multiHYP> NodeX: please explain
[16:00:42] <multiHYP> is there documentation for what these js objects consist of?
[16:01:28] <multiHYP> aha leave the parenthesis out. :)
[16:01:52] <NodeX> my date objects look like this ... { date : { dmy: "20120627", hms:"00:00:00"}} for exmaple
[16:02:14] <multiHYP> ok so also the string approach, handling it in the app layer.
[16:02:14] <NodeX> might not work for you but it allows great flexibility if/when querying
[16:02:42] <multiHYP> nah thats very similar to what I have now.
[16:02:52] <NodeX> added benefit of regexing the year, month, 10 day blocks if needed
[16:03:14] <multiHYP> yes i do that too :)
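
A sketch of that string layout and the anchored-regex queries it allows (field names as in NodeX's example, collection name hypothetical):

    db.logs.insert({ date: { dmy: "20120627", hms: "15:39:24" } })

    // Anchored prefix regexes like these can use an index on date.dmy:
    db.logs.find({ "date.dmy": /^2012/ })     // whole year
    db.logs.find({ "date.dmy": /^201206/ })   // one month
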
[16:04:38] <multiHYP> NodeX: that takes more space though.
[16:05:10] <multiHYP> each of those strings is bigger than an ISODate object.
[16:06:54] <multiHYP> this calculation hasn't been resolved guys: {"date":ISODate()} is bigger in size or {"date":"20120627"}?
[16:13:52] <southbay> Hi all.
[16:14:00] <remonvv> I've given up. Night night.
[16:20:11] <southbay> Can anyone recommend an article which discusses the syntax for reaching into an array of documents to pull out specific fields of a document which sits in the array? I'm not sure if I must use a for loop or if there is dot notation that makes this possible.
[16:26:13] <multiHYP> southbay: you need to fetch the entire array and iterate through it.
[16:26:20] <multiHYP> dumb array.
[16:26:30] <multiHYP> if java or scala, cast it.
[16:33:00] <southbay> Yeah, I was hoping to write some test code through the mongo cli but looks like I'll have to use LINQ to grab the info. : \. Pardon the noob request, I'm looking to take the turn away from MS SQL and use Mongo instead.
[16:37:24] <multiHYP> good choice :)
[16:40:03] <southbay> multiHyp: I realized my syntax was wrong. You can use dot notation to reference fields within documents which reside in an array structure.
[16:40:30] <multiHYP> you could do that, yes.
[16:41:01] <multiHYP> I'm doing it another way: my_array: [{}, {}, {}, {}] anonymous objects inside it.
[16:41:21] <multiHYP> that way I'm not sure if the dot notation would work.
[16:41:59] <multiHYP> unless you want an array element whose specific field equals to some value.
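
A sketch of the dot-notation matching southbay mentions, plus $elemMatch for when several fields must match on the same array element (collection and field names hypothetical):

    db.users.insert({ name: "test", games: [ { _id: 1, score: 10 }, { _id: 2, score: 3 } ] })

    // Dot notation: matches if ANY element has _id 1
    db.users.find({ "games._id": 1 })

    // $elemMatch: both conditions must hold on the SAME element
    db.users.find({ games: { $elemMatch: { _id: 1, score: { $gte: 5 } } } })
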
[16:42:12] <multiHYP> time for football...
[16:51:56] <igor47> so, i'm building a test environment; can i have a replica set with just one machine in it?
[16:55:02] <igor47> my one node became primary, but php is upset and says "couldn't determine master"
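
A single-member replica set works fine for a test box; a sketch of initialising one in the shell (set name and host hypothetical):

    // mongod started with: mongod --replSet test0 ...
    rs.initiate({ _id: "test0", members: [ { _id: 0, host: "localhost:27017" } ] })
    rs.status()    // the lone member should eventually report itself as PRIMARY
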
[17:21:17] <southbay> igor47: I have not delved into the replication piece of mongo yet, but is there any issue with the port that's bound or IP that's bound?
[17:21:35] <southbay> binding conflict is what I'm getting at.
[17:31:02] <_johnny> is it possible to use geospatial to check an input coordinate against a polygon? so instead of looking through the db to find items with location within the poly, rather check if a coordinate is within -- or should i rather implement an algorithm specifically for that?
[17:31:27] <_johnny> -- that's essentially what i'm trying to avoid. i'd like to use mongo's engine for it :)
[18:20:19] <jenner> guys, I'm getting lots of these http://pastie.org/private/okokf2vs2clckg3fj10qwg does anyone know what they mean?
[18:24:09] <mids> jenner: with version of mongo?
[18:24:17] <jenner> mids: 2.x
[18:24:25] <jenner> 2.0.6
[18:27:01] <jenner> mids: this looks like an index issue, right?
[18:27:34] <mids> found some related stacktraces that were index issues
[18:27:41] <mids> maybe you can reindex?
[18:27:50] <mids> though hard to say; not familiar with the internals of mongo
[18:37:53] <Loch_> So how do I find one object, then find something inside of it and return the inner object?
[18:38:57] <Loch_> Here's what I want: users.findOne({'name':'test'}).games.findOne({'_id':1})
[18:38:59] <phatduckk> anyone here using morphia?
[18:39:49] <mids> Loch_: you cant just return an inner object
[18:40:56] <Loch_> Alright, so I can do the findOne on the users, then return the whole object, but then I cannot use the MongoDB methods to find a particular game in that object, correct?
[18:41:29] <mids> correct
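
Since the server hands back the whole document here, one workaround is to pick the embedded object out in shell or application code after findOne (a sketch, names hypothetical):

    var user = db.users.findOne({ name: "test" })
    var game = null
    user.games.forEach(function (g) { if (g._id === 1) game = g })
    // game now holds the single embedded document, or null if nothing matched
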
[18:54:12] <Aram> hi, I want some key, other than _id to be unique, I can do this by calling ensureIndex, but is there other way?
[18:54:47] <mids> nope, you do it by creating an index with the unique:true attribute
[18:55:22] <Aram> well thanks.
[18:55:38] <mids> why is that a problem for you?
[18:56:11] <Aram> no problem, just curious if I could do it without writing extra code.
[18:56:14] <Aram> but no problem.
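
A sketch of the unique index mids describes (collection and field hypothetical):

    db.accounts.ensureIndex({ email: 1 }, { unique: true })

    db.accounts.insert({ email: "a@example.com" })   // ok
    db.accounts.insert({ email: "a@example.com" })   // E11000 duplicate key error
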
[18:57:39] <tubbo> hi guys
[18:57:58] <tubbo> i'm looking to convert some data OFF of mongo and onto PostgreSQL. i want to do this by exporting CSV and importing those CSV files into PG directly with the COPY command
[18:59:19] <mids> okay
[18:59:23] <tubbo> however, i'm running into some major issues...like it seems `mongoexport` is incapable of using the hashed BSON::ObjectIds
[18:59:37] <tubbo> therefore, i don't see how it's possible to actually import that CSV into a SQL database...
[18:59:53] <tubbo> am i missing something?
[18:59:55] <mids> do you need those object ids?
[19:00:58] <tubbo> mids: yes, they map to relations in the database
[19:01:41] <tubbo> mids: basically, in order to preserve associations i need to convert all BSON objectids into the numerical ID assigned when the row is inserted into postgres
[19:02:00] <tubbo> eventually
[19:02:10] <mids> how much data are we talking about?
[19:02:12] <tubbo> but yeah i do need them. is it not possible to export them?
[19:02:22] <tubbo> mids: about 20,000 objects per table
[19:02:30] <tubbo> s/table/collection
[19:02:36] <mids> I just exported something with objectIds; no problem
[19:02:41] <mids> not sure what you mean with hashed object ids though
[19:02:46] <mids> can you pastebin an example?
[19:03:47] <tubbo> sure
[19:04:05] <tubbo> mids: this is what my CSV looks like https://gist.github.com/e1c86249be464cd1a1e1
[19:04:42] <tubbo> mids: so for example, ObjectId(4f6340c6a7ed657c49000002) needs to just be "4f6340c6a7ed657c49000002"
[19:08:31] <tubbo> is there any way to tell `mongoexport` "just cast ObjectId() into a String?"
[19:09:11] <mids> no, but you can fix that with sed or some other unix tool
[19:16:49] <mids> tubbo: something like this: sed -r -e 's/ObjectID\(([0-9a-f]+)\)/\1/g' <dev.csv
[19:17:29] <tubbo> mids: is that < a typo or is that the opposite of >?
[19:17:44] <erick2red> can I use mongodb for a project where the files I plan to store in it won't be bigger than 300~400 MB?
[19:18:22] <mids> tubbo: its read from dev.csv; guess it is optional in this case
[19:20:08] <tubbo> mids: sed threw an error, illegal option -r. what does that do?
[19:21:05] <mids> extended regular expression support
[19:21:17] <mids> using OSX?
[19:21:40] <tubbo> mids: yeah
[19:21:54] <mids> try -E
[19:22:06] <tubbo> so just sed -e 's/ObjectID(([0-9a-f]+))//g'
[19:22:07] <tubbo> ?
[19:22:10] <tubbo> sorry
[19:22:15] <tubbo> sed -E 's/ObjectID(([0-9a-f]+))//g'
[19:22:19] <mids> -Ee
[19:22:22] <tubbo> ok
[19:23:12] <mids> and you are missing some \'s
[19:23:33] <tubbo> ?
[19:23:36] <tubbo> i copied it verbatim..
[19:23:49] <tubbo> wtf wait
[19:23:50] <mids> nope :)
[19:23:52] <tubbo> why is it different lol
[19:24:17] <mids> erick2red: http://www.mongodb.org/display/DOCS/GridFS
[19:24:41] <tubbo> 's/ObjectID\(([0-9a-f]+)\)/\1/g'
[19:24:45] <erick2red> mids, I read it already, the thing is that I heard mongodb is too heavy for simple use
[19:24:46] <tubbo> 's/ObjectID(([0-9a-f]+))//g'
[19:24:49] <tubbo> wtf..
[19:25:59] <tubbo> what the hell i'm copying it VERBATIM
[19:26:02] <mids> erick2red: oh
[19:26:02] <tubbo> and it's different
[19:26:03] <tubbo> wtf
[19:26:29] <tubbo> ok NOW it's right lol
[19:26:58] <tubbo> mids: yeah that sed line still isn't doing anything
[19:27:32] <tubbo> omg i know why
[19:27:32] <tubbo> lol
[19:27:37] <tubbo> it was escaping the slashes
[19:28:27] <erick2red> mids, what I want to know is some impressions of the people who use it, I'm considering using it for a project, and I want to know first what you people think of it
[19:28:44] <mids> erick2red: you need to be more specific.
[19:28:54] <tubbo> erick2red: you're asking #mongodb what they think of mongodb? that sounds COMPLETELY unbiased lol
[19:30:25] <tubbo> lol thanks mids :)
[19:30:34] <mids> np :D
[19:30:34] <tubbo> mids: i'm actually running this through a Rake task so that's why it was escaping :)
[19:31:14] <erick2red> tubbo, I know the answers can be biased, but I can manage that, actually the impressions I heard before are biased as well, along the lines of "Oracle DB FTW" and stuff
[19:31:19] <mids> tubbo: if you need to turn these into numeric IDs anyway, you can do that in your Ruby code in one pass
[19:31:32] <mids> no need to use obscure tools like sed :)
[19:32:17] <mids> and storing 400mb files in Oracle doesn't sound heavy?
[19:32:19] <tubbo> mids: sorta. this is safer because i can create temp tables. it's more annoying to do that totally within ruby
[19:32:51] <tubbo> my idea is to extract everything to temp tables, then create the new schema, then i just import the relevant data, converting the shit i need to convert along the way
[19:32:59] <tubbo> but this way nothing gets lost accidentally, and we can always roll back
[19:33:29] <mids> ok
[19:37:57] <bosky101> hi, my mongo shell queries on a replicaset are returning in sub-second, while the native mongo driver is taking 10 minutes on a server with no load ( the final number of docs is > 1 million rows ). any idea what i can do to tweak this or why its taking time ?
[19:40:26] <mids> bosky101: native, which nation?
[19:41:04] <bosky101> mids: why
[19:41:19] <wereHamster> mids: inuit
[19:41:36] <mids> I guess 'nature' would be the proper noun
[19:45:21] <erick2red> mids, tubbo: so is it light enough for a hobby project, or should I stick with mysql?
[19:46:13] <mids> erick2red: I use it for most of my hobby projects
[19:47:03] <erick2red> mids, for a kind of cloud storage system I think it's a great fit, right?
[19:47:33] <erick2red> mids, you could be a Rockefeller and your hobby projects could involve rocket building and stuff, jejeje
[19:48:39] <mids> you got me
[19:51:13] <erick2red> so, I think gridfs is great for using in a distributed filesystem, right?
[19:56:01] <planrich> hi, i was wondering if my approach is the mongo way:
[19:56:03] <planrich> i have a document like so:
[19:56:05] <planrich> { "name" : "abc", ... , "measurements" : [ { "name" : "oxigentest", "value": "20" }, ... ] }
[19:56:09] <planrich> so the whole document has an id when i insert, but the measurements do not have an id. basically i would like to identify each of my measurements by some kind of id. is the index a good choice?
[19:57:31] <erick2red> does mongodb has fts ?
[19:59:08] <planrich> might be helpful: http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo
[19:59:43] <mids> planrich: what do you need to do with the measurements?
[20:00:18] <planrich> mids: delete/update the values, so i need some kind of identification of each measurement
[20:02:39] <planrich> and of course i can use the index to find the measurement again (i suppose mongodb will not reorder arrays until i tell it to) but my question is if the index of an array is a good idea, or should i try to generate an object id for each of my measurements?
[20:03:26] <mids> I'd generate an id myself
[20:04:53] <planrich> kk then ill checkout how i would do that
[20:04:55] <planrich> thx
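
A sketch of generating an id per embedded measurement, which also lets a single element be targeted later with the positional operator (field names from planrich's example, collection name hypothetical):

    var mid = ObjectId()
    db.samples.insert({
        name: "abc",
        measurements: [ { _id: mid, name: "oxigentest", value: "20" } ]
    })

    // Update or remove just that one measurement by its own id:
    db.samples.update({ "measurements._id": mid },
                      { $set: { "measurements.$.value": "25" } })
    db.samples.update({ name: "abc" },
                      { $pull: { measurements: { _id: mid } } })
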
[20:59:44] <jasonframe> hey guys, have a question about replica set setup. We have a set with two servers running primary / secondary and an arbiter running on an app server
[20:59:49] <jasonframe> we are going to add another app server under a load balancer, what is the best practice with the arbiter
[21:04:58] <linsys> jasonframe: you want an odd number of votes so currently you have 3 votes, don't add another arbiter
[21:06:10] <jasonframe> I was thinking keeping the single arbiter on app server 1, and none on app 2, just needed second opinion for sanity
[21:06:42] <linsys> jasonframe: get rid of the arbiter and deploy 3rd mongodb replica set member...
[21:06:48] <linsys> that is really the only other option as I see it
[21:15:30] <tystr> is anyone here using 10gen's monitoring service ?
[21:20:27] <BurtyB> tystr, I am and I imagine a lot of others are too
[21:20:45] <tystr> heh, that' what I assumed :)
[21:21:55] <tystr> what's the best practice for running the agent? should I run it on each node in my replica set?
[21:47:39] <godawful> I've got a degraded replica set: 2 nodes up, one node down. I'm replacing the borked node with a new one from backups.
[21:48:08] <godawful> Just wondering about the best procedure to add it without disturbing the surviving nodes: Remove the dead one first, then add? Or just add it?
[21:48:34] <godawful> I'm worried that the primary's going to unelect itself if the voting changes or something...
[21:49:08] <ranman> godawful: just add it then remove the old one is my suggestion
[21:49:21] <ranman> or you can add it as the old one if you have it on ec2 or something
[21:49:38] <ranman> (change hostname, etc.)
[21:54:29] <godawful> So if I change dns so that the dead box's hostname points to the new box, it will just get picked up?
[21:54:39] <godawful> do the config servers poll or something?
[21:59:45] <Epona> hello
[22:00:03] <rockets> Is there any issue with doing a mongodump on a 1.8.x database and then importing to a 2.x database?
[22:02:54] <Epona> hey rockets
[22:03:02] <rockets> Hey Epona!
[22:03:10] <Epona> do you know much about mondo configuration file locations?
[22:03:14] <Epona> mongo*
[22:03:19] <rockets> Epona: I'd say 100% nothing :)
[22:03:23] <rockets> I'm just learning mongo
[22:03:24] <Epona> im used to postgres
[22:03:27] <rockets> although im technically in charge
[22:03:39] <Epona> some sites are saying its all done through the command line
[22:03:46] <Epona> which is a bit daunting
[22:04:16] <rockets> Epona: well yes, I'd recommend managing all your linux/unix services through command line
[22:04:20] <rockets> you wont really learn what youre doing
[22:04:23] <rockets> without using the command line
[22:04:51] <Epona> sometimes I dont want to learn what I do though
[22:05:06] <Epona> heh
[22:05:14] <rockets> Welp, when things break, best of luck trying to fix them :)
[22:11:32] <godawful> ranman: changing hostname worked fine thanks!
[22:11:54] <ranman> godawful: happy that works :)
[22:13:03] <ranman> godawful: and yes the RS does poll: http://www.mongodb.org/display/DOCS/Replica+Sets
[22:35:45] <tystr> anyone have a sysvinit script for mms-agent by chance?
[23:03:23] <tystr> :(
[23:10:10] <benop> anyone able to help me with a weird connection error?
[23:10:34] <benop> MongoDB.Driver.MongoConnectionException: Unable to connect to server: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full
[23:10:44] <benop> the server in question has only 24 open connections
[23:12:25] <dstorrs> so, with update I can use $inc to atomically update one element, even an array item. Is there a way to say "here's a array of values, atomically add them to the corresponding array values at 'key' ?
[23:13:53] <dstorrs> so, if the doc is { views_by_day : [ 0, 1, 2, 3 ] } and I add the array [3, 4, 5, 6], I should be left with a doc that says { views_by_day : [ 3, 5, 7, 9 ] }
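
There is no single operator that adds two arrays element-wise, but one update can $inc several positions at once by addressing them with numeric dotted keys, which stays atomic within the document (a sketch, collection name hypothetical):

    // doc: { _id: 1, views_by_day: [0, 1, 2, 3] }
    db.stats.update(
        { _id: 1 },
        { $inc: { "views_by_day.0": 3,
                  "views_by_day.1": 4,
                  "views_by_day.2": 5,
                  "views_by_day.3": 6 } }
    )
    // -> { _id: 1, views_by_day: [3, 5, 7, 9] }
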
[23:34:52] <dstorrs> with update(), how do the '$set' and '$inc' operators interact with upsert : 1 ?
[23:35:29] <dstorrs> if the record does not exist, does it get init'd, then '$inc' treats unset values as 0 ?
[23:36:03] <dstorrs> what if the record exists, but has a slightly different structure than what you're updating? (keys missing / extra keys added)
[23:40:22] <ferrouswheel> dstorrs, yes
[23:40:40] <ferrouswheel> for your previous question you might be able to use the positional modifier: http://www.mongodb.org/display/DOCS/Updating#Updating-The%24positionaloperator
[23:40:52] <ferrouswheel> but I don't know how $inc and the $ positional interact
[23:41:19] <ferrouswheel> upserts for inc assume 0
[23:41:58] <dstorrs> ferrouswheel: *cough* "The positional operator cannot be combined with an upsert since it requires a matching array element. If your update results in an insert then the "$" will literally be used as the field name."
[23:42:05] <dstorrs> from the link you just pasted. :>
[23:42:42] <ferrouswheel> okay, I was referring to your previous question where you didn't specify you needed an upsert.
[23:44:07] <russfrank> is there a better way to extract frequency information on a particular field than some group query?
[23:46:19] <halcyon918> is there a way (or even a need) to use the repset name when connecting to mongo via the java driver? (our DBA told me "from the app, you can use repset name" but I don't see anything to that effect in the online docs)
[23:51:27] <dstorrs> ferrouswheel: ah, ok. my bad.