#mongodb logs for Friday the 27th of February, 2015

[01:26:01] <garthkerr> The ubuntu installation repo found in the instructions for 3.0 is failing.
[01:26:18] <garthkerr> echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
[01:26:28] <rkgarcia> what's the error?
[01:26:34] <garthkerr> W: Failed to fetch http://repo.mongodb.org/apt/ubuntu/dists/trusty/mongodb-org/3.0/multiverse/binary-amd64/Packages 404 Not Found
[01:26:46] <garthkerr> It seems to have moved under /testing/
[01:27:07] <rkgarcia> maybe, it's an RC
[01:27:07] <garthkerr> 14.04
[01:29:10] <garthkerr> This works:
[01:29:12] <garthkerr> echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/testing multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
[02:35:07] <MLM> I want to implement an ordered list. I have a model with an integer `order` field. What would be the best way to update each successive row/document when an item is moved around?
[02:37:27] <GothAlice> MLM: https://gist.github.com/amcgregor/4361bbd8f16d80a44387 < a full taxonomy model mix-in (Python, but demonstrative)
[02:38:41] <GothAlice> Most append/prepend after/before calls come down to a call to insert, here. There's maths. And likely race conditions. And it's not optimized for the case of moving something within a set of siblings.
[02:38:53] <GothAlice> (It's proof-of-concept.)
[02:38:58] <MLM> GothAlice: Thanks, where is `self.reload`?
[02:39:12] <GothAlice> That's a method provided by MongoEngine. It returns a fresh instance of self.
[02:39:20] <GothAlice> (With all new data from MongoDB.)
[02:39:36] <MLM> Ahh, I bet Mongoose has something similar
[02:39:39] <GothAlice> If it's missing, it's probably in MongoEngine. ;)
[02:39:54] <GothAlice> MONGOOOOOSE
[02:40:28] <MLM> Something better than it? (just checking)
[02:40:40] <GothAlice> Raw MongoDB? :hopeful smile:
[02:41:18] <GothAlice> You'll notice bits of this code drop down to raw PyMongo, too. (Basically anywhere there's curly braces…)
[02:42:24] <MLM> I am not keen either way. I am not sure what Mongoose offers over the vanilla driver: https://www.npmjs.com/package/mongodb
[02:42:53] <GothAlice> In the last week I've assisted six individuals with Mongoose-related injuries. :|
[02:43:35] <MLM> Does the npm mongodb package have a promise based API?
[02:43:45] <MLM> (not finding anything atm)
[02:44:23] <GothAlice> I don't know.
[02:44:45] <MLM> Looks like some people did make wrappers: https://www.npmjs.com/package/mongodb-promises, https://www.npmjs.com/package/mongodb-promise
[02:45:09] <GothAlice> Sounds… *drumroll* promising?
[02:47:51] <MLM> Ye, I don't see any difference in API to Mongoose so why not. Plus promises are awesome and this package seems pretty up to date: https://www.npmjs.com/package/mongodb-promises
[02:49:36] <MLM> Should I add a bool to items to add a lock to address race conditions?
[02:52:37] <GothAlice> That could help, as long as you have a sane retry and failure policy in the event of a long-term conflict.
[02:52:56] <GothAlice> Backup plans for backup plans, there. ;)
[02:56:08] <MLM> hmm, If you have any good articles on the retry/failure topic, I'm interested
[02:56:21] <MLM> *know of
[02:56:30] <GothAlice> Well… it comes down to how long you're willing to let a client request wait.
[02:56:46] <GothAlice> And checking to see if a lock isn't freed in a reasonable amount of time, and forcing it open if need be.
[02:57:09] <GothAlice> See: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/
[02:58:49] <GothAlice> Often things like this may require background processing, often on a schedule. https://gist.github.com/amcgregor/4207375 is a model for this, including sample code, sadly in Python. :)
[02:59:07] <GothAlice> (Document samples, though, are pure MongoDB.)
[02:59:17] <MLM> Ye I have read a bit about the job queue problem with databases. This is just an ordered list that I want to make
[02:59:57] <MLM> No processing but I could see a conflict if multiple requests are made to reorder the same list. Order indexes could get jumbled
[03:00:04] <MLM> or corrupted
[03:01:46] <GothAlice> You effectively need to two-phase a $inc for a range of documents (siblings after the insertion point) and a $set of the document into its position.
[03:03:20] <GothAlice> With a unique key on (parent, sequence_no) or such. In the event the $set fails, roll back the "transaction" ($inc: -1 things back). Mid-"transaction" here the sort order will be otherwise unaffected.
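A rough mongo-shell sketch of the two-phase move GothAlice describes (collection and field names here are made up for illustration; retry and error handling are elided):

    // shift siblings at or after the target position down by one
    db.items.update(
        { parent: parentId, order: { $gte: newOrder } },
        { $inc: { order: 1 } },
        { multi: true }
    )
    // drop the moved document into the freed slot; a unique index on
    // { parent: 1, order: 1 } turns a concurrent conflicting write into an E11000 error
    db.items.update({ _id: movedId }, { $set: { order: newOrder } })
    // if the $set fails, roll the shift back:
    // db.items.update({ parent: parentId, order: { $gte: newOrder } },
    //                 { $inc: { order: -1 } }, { multi: true })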
[03:04:54] <MLM> Would it be wise to use the latest request to re-order? And of course debounce to every second or so if a constant stream of re-ordering requests are made?
[03:05:45] <MLM> nevermind because the re-ordering is only per item. not bulk movement
[03:06:37] <GothAlice> Indeed.
[03:07:37] <GothAlice> There are also subtleties to how you count records when inserting with offsets from the "bottom" of the list instead of "top".
[03:09:22] <MLM> I think I will just allow a user to set a new order index of an item by id. Then just update/shove the other items of the list into place.
[03:11:23] <GothAlice> I have pretty drag-and-drop re-ordering in my MongoDB CMS that uses this type of arrangement, with an API of alterChildIndex(oldIndex, newIndex).
[03:12:22] <GothAlice> I'll be moving to the taxonomy API I linked above for a more nuanced approach, once I armour it against race conditions and verify it, in the above way. ;)
[03:19:12] <GothAlice> It's more of a defensive coding practice. I'll see if I can dig up some links.
[03:19:20] <GothAlice> Er, that was a wrong window!
[03:21:05] <MLM> In Mongoose you define Model Schemas. Since I believe Mongodb is loose with its fields, there is no enforcement there right?
[03:21:32] <MLM> (no enforcement with the raw driver) to get back a certain object shape with properties and defaults if non-existent
[03:56:32] <GothAlice> MLM: Your first set of statements is correct.
[03:57:03] <GothAlice> During certain operations, one could in theory provide default values if extant values do not exist, such as during an aggregate query $project operation.
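For instance, something like this (field names are only illustrative) supplies a default for a missing field during a $project stage:

    db.users.aggregate([
        { $project: { name: 1, status: { $ifNull: [ "$status", "active" ] } } }
    ])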
[03:57:20] <Freman> hey folks
[03:57:30] <Freman> i'm having some troubles with mongo 3.something
[03:58:00] <Freman> http://pastebin.com/ebfBfiDB
[03:58:24] <Freman> as you can see, both transactions are upserts...
[03:58:42] <Freman> the one that fails is the first one for T8WA3j9EE2ESzN53PzH7, the second one succeeds
[03:59:01] <Freman> (first one is logged as 'saving log' second one is logged as 'appending log')
[03:59:04] <joannac> what the hell is that?
[03:59:16] <joannac> that's not a mongod log
[03:59:26] <Freman> no, its log from our node app
[04:00:02] <Freman> update(find, update, options) was just logged as console.dir([find, update, options])
[04:00:22] <joannac> race condition?
[04:00:49] <Freman> quite probably, the distance between the two attempts to write is just 100ms...
[04:01:30] <Freman> but shouldn't mongo be able to handle that gracefully?
[04:03:53] <Freman> didn't have this problem till yesterday when we upgraded
[04:04:45] <Freman> I mean, it shouldn't take long to find by id...
[04:04:47] <joannac> what does the final document look like?
[04:06:54] <joannac> how do you know which one succeeded?
[04:07:09] <joannac> what do the mongod logs say?
[04:09:31] <Freman> I know that the second one succeeded because of where they are and what the error message is set to on callback
[04:09:54] <Freman> $pushes only happen on append, the first $set is 'save'
[04:11:51] <Freman> http://pastebin.com/gA4EcN1A is the final document
[04:14:11] <Freman> pseudo code for the task looks like function startlog() {mongo.update(..., function(err) {'Problem saving log'})}; function appendlog() {mongo.update(..., function(err) {'problem appending log'})};
[04:14:30] <joannac> mongod logs?
[04:14:35] <Freman> they're both being called about 100 ms apart but the second call is the one that succeeds because it's the first one that complains
[04:14:38] <joannac> oplog?
[04:15:25] <Freman> I'm finding the mongod logs now...
[04:16:36] <Freman> wow, that is a big log file
[04:23:19] <Freman> there is no mention of T8WA3j9EE2ESzN53PzH7 in the log
[04:25:10] <Freman> nothing that would spit out "Problem saving log MongoError: E11000 duplicate key error index: logsearch.log_outputs.$_id_ dup key: { : "T8WB63bTLGwQwdAnk68f" }" shows up in the log
[04:26:02] <Freman> oh no, I found some
[04:26:08] <Freman> I searched for the error not the key
[04:31:03] <Freman> similar event logs http://pastebin.com/xgNnTBDr (sorry, couldn't find the exact event)
[04:34:53] <Freman> sometimes (more rarely) the conflict seems to happen the other way - appending loses out to saving
[04:36:53] <joannac> need more logs
[04:38:28] <joannac> actually... are you using the mongodb nodejs driver?
[04:39:01] <joannac> if so what version?
[04:43:10] <Freman> yes, should be... latestish, just a sec
[04:44:07] <Freman> "version": "1.4.31"
[04:50:18] <joannac> Freman: http://docs.mongodb.org/manual/reference/method/db.collection.update/#use-unique-indexes
[04:50:33] <joannac> what you see is expected if you're going to use upsert:true
[04:50:47] <Freman> so I should probably ignore it?
[04:50:57] <joannac> no, you should catch the error and handle it
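A minimal sketch of what "catch the error and handle it" could look like with the Node driver's callback-style update (assuming the duplicate-key error surfaces its 11000 code on err.code; all names here are placeholders):

    function upsertLog(collection, find, update, options, retriesLeft, done) {
        collection.update(find, update, options, function (err, result) {
            if (err && err.code === 11000 && retriesLeft > 0) {
                // another writer won the upsert race; retrying now matches the existing doc
                return upsertLog(collection, find, update, options, retriesLeft - 1, done);
            }
            done(err, result);
        });
    }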
[07:03:46] <Freman> I got rid of the entire startlog, now the append and end functions are fighting
[07:10:39] <Freman> ok, not even upserting any more
[07:16:51] <Freman> is it perhaps something in my indexes it doesn't like
[07:35:59] <Freman> meh
[07:36:01] <Freman> until monday
[08:29:52] <flusive> hi all
[09:49:11] <morenoh149> How much data would it cost to record all irc communications on the freenode network?
[09:54:40] <aslomowski> hi lads
[09:55:28] <aslomowski> I have a problem with querying and deleting. How can I truncate an array in a subdocument which is also in an array? :>
[09:55:33] <aslomowski> This is my scheme:
[09:55:41] <aslomowski> http://pastebin.com/ksW0F8wj
[09:56:24] <aslomowski> it's a bit of a hard query I think
[09:56:38] <aslomowski> I have a problem with arrays all the time
[09:58:43] <morenoh149> aslomowski: what's `schema` line 7?
[09:59:04] <aslomowski> timeTableSubSchema
[09:59:10] <aslomowski> i would like to truncate it
[09:59:23] <aslomowski> it's a subdocument in the schema ;p
[10:03:05] <morenoh149> aslomowski: if `schema` on 7 were the schema of timeTableSubSchema then it should have the hour, minutes and type fields as well no?
[10:03:31] <flusive> GothAlice, http://picpaste.com/htop-YLxZbetZ.png do you think is still ok?
[10:03:35] <morenoh149> there are two object's in that paste. What are they?
[10:04:50] <aslomowski> morenoh149: hmm...
[10:05:50] <aslomowski> morenoh149: I don't understand. Hmm.. timeTableSubSchema is just a subdocument of schema.stops.timetable
[10:06:10] <aslomowski> I could do it in one scheme but I had problems with accessing the fields
[10:06:19] <aslomowski> so I have read that I should split them
[10:06:23] <morenoh149> ohh I see
[10:06:23] <aslomowski> and make subscheme
[10:06:54] <aslomowski> and now I need to get access there and truncate
[10:07:01] <aslomowski> before I will push updated details
[10:07:04] <morenoh149> aslomowski: you should split if nesting will make the document grow larger than 16mb
[10:07:35] <aslomowski> nvm, can I truncate this table in any way?
[10:07:42] <morenoh149> so want to flatten both? why
[10:07:52] <aslomowski> like schema.stops[i].timetable.length(0)
[10:08:03] <aslomowski> basically I had tried that but it did not work
[10:08:39] <morenoh149> Why do you want to truncate it? deletion?
[10:09:02] <aslomowski> yep. I would like to update this for demand of user
[10:09:21] <aslomowski> i would like to replace this by new object
[10:09:39] <aslomowski> maybe I could do it,by not using .push()?
[10:10:00] <aslomowski> coz that's how I push JSON right now
[10:10:04] <aslomowski> into that array
[10:10:16] <morenoh149> .length(0) <-- what did you expect this would do?
[10:10:37] <aslomowski> make length of array equal to 0
[10:10:43] <aslomowski> means delete everything
[10:11:46] <morenoh149> aslomowski: in what library? that's not a mongoose thing nor a javascript thing
[10:11:59] <aslomowski> i have found it in google
[10:12:08] <aslomowski> but as I said it didnt work :)
[10:12:32] <morenoh149> arrays have .length which returns the length of the array. There is no setter .length(n) in JS nor Mongoose.
[10:12:53] <morenoh149> What driver are you using? what language are you programming in?
[10:12:59] <aslomowski> javascript
[10:13:13] <morenoh149> are you using mongoose?
[10:13:13] <aslomowski> with node and express if its important
[10:13:16] <aslomowski> yep
[10:13:25] <morenoh149> there's channel #mongoosejs
[10:14:01] <aslomowski> ah ok :)
[10:14:21] <aslomowski> so ill have to explain everything once again
[10:14:34] <morenoh149> yes. don't be lazy
[10:15:02] <aslomowski> actually I am :)
[10:15:10] <aslomowski> and fortunately I found a solution
[10:15:17] <aslomowski> instead of length(). just = []
[10:15:20] <aslomowski> heh
[10:15:33] <morenoh149> aslomowski: yeah read the docs devdocs.io!
[10:15:42] <aslomowski> devdocs.io?
[10:15:48] <aslomowski> is it interesting?
[10:19:14] <pamp> hi
[10:19:27] <pamp> i need help with an operation
[10:19:39] <pamp> i want to disaggregate a collection
[10:19:58] <pamp> like in this example
[10:20:00] <pamp> http://dpaste.com/3KHESTM
[10:20:32] <pamp> i can do it with aggregation and "$out"
[10:20:44] <pamp> and can generate a new id
[10:20:59] <pamp> but can't use the original _ids of the collection
[10:21:41] <pamp> someone already had the same problem?
[10:22:27] <pamp> if i project a new collection without the original _ids this works
[10:22:37] <pamp> but i need these _ids too
[10:45:21] <brano543> Hello, i would like to use an external index library with mongo. Can you tell me how am i supposed to do it? Just store the id in both places,in the library and in mongo?
[10:47:03] <brano543> The library i want to use is libspatialindex, right now i am able to insert into the index, but i would like to use mongo as the database backend for this index.
[10:48:25] <rymate1234> noob question - I have a mongodb collection with a date field - example field is "date" : ISODate("2015-02-27T10:40:14.962Z"). How would I query the database to get items from the last hour?
[10:50:09] <brano543> rymate1234: as far as i know, mongo is an object database, you can't just use functions like in relational databases. i guess you will need to "explode" that date into separate fields.
[10:50:43] <obeardly> brano543: your question makes no sense, MongoDB is fairly optimized for what it does, why screw with it? My only suggestion if you are looking for key/value store, and nothing else, you could look at AeroSpike, but AFAIK as a DB, you can't beat de facto MongoDB
[10:51:56] <brano543> obeardly: maybe you misunderstood me. i don't want to replace mongoDB. i just need to use the r-tree spatial index from libspatialindex.
[10:52:44] <obeardly> brano543: quite possible, I'm at the end of my day, I'll reread
[10:53:50] <brano543> obeardly: You said MongoDB is optimized, i hope so, because i want to create edge-expanded graph representations and use it for routing
[10:54:40] <brano543> obeardly: i mean each document will have a list of neighbours with their cost values.
[10:55:17] <obeardly> brano543: My best guess, if your question exceeds the basis of MongoDB capability, it's beyond my ability; I do know that MongoDB has the ability to use C libraries, but it's beyond my expertise, apologies for speaking out of line
[10:56:40] <Tobatsu> Hi. One question.. I've been told that MongoDB is somehow based on Hadoop. Is that correct information?
[10:56:55] <brano543> obeardly: i just asked out of curiosity, it would be great to serve spatial queries. For my purpose right now, i will just make do with using the R* tree library temporarily to preprocess the data and insert into mongoDB.
[10:57:33] <obeardly> rymate1234: your question is a little out of my wheelhouse, but I found this:
[10:57:33] <obeardly> http://stackoverflow.com/questions/18233945/query-to-get-last-x-minutes-data-with-mongodb
[10:58:04] <rymate1234> huh, that doesn't look too hard
[10:58:12] <rymate1234> just need to figure out how to do the same with python
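In the mongo shell, the same idea is a one-line range query against the ISODate field (collection and field names assumed); PyMongo accepts a datetime value in the same position:

    // documents whose "date" falls within the last hour
    db.mycollection.find({ date: { $gte: new Date(Date.now() - 60 * 60 * 1000) } })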
[10:58:17] <obeardly> Tobatsu: you're comparing Hummers vs Passenger cars
[10:59:03] <Tobatsu> obeardly: So I thought too but I just wanted to be sure. :)
[10:59:16] <brano543> is here someone who can tell me how to use external index library libspatialindex with mongoDB? :)
[10:59:34] <obeardly> Tobatsu: think of Hadoop as a Hummer, it can carry a lot of data, but it's kind of slow, where passenger cars carry a smaller payload, but are much faster
[10:59:40] <brano543> there is really lack of information about something like this
[11:00:11] <obeardly> brano543: again, sorry I can't help you
[11:01:43] <Tobatsu> So. Different thing. Thx :)
[11:01:46] <obeardly> Be easy my friends, true happiness lies within ones self, and not their existentialism
[11:07:19] <brano543> If anyone knows someone who might be able to help me with using external spatial library libspatialindex with MongoDB or clarify me the concept how it may work,please let me know, i will stay on the channel :)
[11:20:51] <jsfull> hi guys, every day my crawler collects a list of the top 20 urls from google for a particular keyword. the results change often, and what i want to find out is how volatile the result is for today. so my question is: how do i calculate the volatility/change %? wondering what math equation i should use - input on this would be really helpful
[11:47:34] <zivix> jsfull: I think that largely depends on your definition of "volatility". But as a starting point I'd use the rank delta as a measure, and you can create a scale from less to more volatile based on how often the rank changes or how big that change is. Overall volatility would be a function of those two datapoints over time, imo.
[11:48:31] <brano543> If anyone knows someone who might be able to help me with using external spatial library libspatialindex with MongoDB or clarify me the concept how it may work,please let me know, i will stay on the channel :)
[12:06:28] <d0x> Hi, is there a way to get the calendarweek of a date in mapreduce?
[12:06:44] <d0x> Or do i need to calculate it myself?
[12:08:38] <d0x> in the aggregation framework there is the $week option for that
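If the job can be expressed with the aggregation framework instead of map-reduce, the $week operator d0x mentions looks roughly like this (collection and field names are assumptions):

    db.events.aggregate([
        { $project: { week: { $week: "$createdAt" } } },
        { $group: { _id: "$week", count: { $sum: 1 } } }
    ])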
[12:25:50] <brano543> If anyone knows someone who might be able to help me with using external spatial library libspatialindex with MongoDB or clarify me the concept how it may work,please let me know, i will stay on the channel :)
[12:38:51] <jsfull> thanks zivix, good point - will give that a try :-)
[13:01:15] <brano543> If anyone knows someone who might be able to help me with using external spatial library libspatialindex with MongoDB or clarify me the concept how it may work,please let me know, i will stay on the channel :)
[13:29:45] <aliasc> hi all
[13:34:23] <StephenLynx> sup
[13:34:34] <aliasc> how large can a document be, seriously
[13:34:42] <aliasc> i plan on embedding large arrays
[13:34:49] <aliasc> very large hundreds of thousands
[13:35:01] <aliasc> for each record
[13:35:18] <aliasc> which can sum up to millions
[13:36:14] <brano543> aliasc: official documentation says 16 MB, you should read it. if you want bigger you need something called GridFS
[13:36:21] <brano543> If anyone knows someone who might be able to help me with using external spatial library libspatialindex with MongoDB or clarify me the concept how it may work,please let me know, i will stay on the channel :)
[13:37:01] <aliasc> what happens if the limit is exceeded
[13:37:04] <StephenLynx> 16mb
[13:37:17] <aliasc> i can't ask the documentation what i want to know :) so thats why i'm here
[13:37:34] <StephenLynx> probably it throws an error if you try to add stuff that exceeds this limit.
[13:38:04] <brano543> StephenLynx: Hello, can you help me with my issue of using libspatialindex as secondary index to MongoDB. what are the practices?
[13:38:18] <StephenLynx> i don't even know what is a spatial index :v
[13:38:38] <aliasc> probably. hm we are talking about valuable data here, i can't rely on probably
[13:38:51] <StephenLynx> run a test then.
[13:39:12] <brano543> StephenLynx: it is a tree where you can store geographical coordinates and they are grouped according to the Minimum-bounding-rectangle approximation of each area.
[13:39:13] <StephenLynx> make text out of files and keep adding it to a document
[13:39:27] <aliasc> currently i structured my data like in an RDBMS model with ids linking to documents but that's not what mongodb is about, right
[13:39:30] <d0x> Is there a way to call db.xxx.find(...) inside a map function of an Map Reduce task?
[13:39:47] <aliasc> i want to embed it all
[13:39:47] <StephenLynx> aliasc if you do that for EVERYTHING, you are doing it wrong.
[13:40:07] <StephenLynx> if you embed EVERYTHING, you probably are not 100% either.
[13:40:20] <aliasc> not for everything, im separating data to avoid limitations
[13:40:32] <StephenLynx> yeah, that is ok.
[13:40:38] <StephenLynx> if you don't need to make multiple queries
[13:40:54] <aliasc> a collection can have thousands of documents but a document in that collection can have hundreds of thousands of arrays
[13:41:21] <aliasc> what i did is separate data into its own collection for arrays that might get long
[13:41:28] <StephenLynx> yeah, I do that too.
[13:41:29] <aliasc> and just link them
[13:41:47] <StephenLynx> and I keep embedded stuff that I am sure that won't grow too large.
[13:42:14] <brano543> StephenLynx: Nevermind,you may be able to answer me second question. I want to store adjacency list in MongoDB where each record contains neighbor ids and travel costs. Is it a good idea to let mongodb actually query the DB like this and find a route by known algorithms or is it slow?
[13:42:22] <StephenLynx> for example. I make a separate collection for threads in a forum and posts in a thread. but I keep embedded the list of mods for the forum.
[13:42:28] <aliasc> exactly. and the thing i love is i program classes the way i want my collection to be and just insert the object of the class
[13:42:33] <aliasc> isnt that wonderful :)
[13:42:53] <StephenLynx> brano543 I would advice having a look into databases built for graphs.
[13:43:04] <StephenLynx> someone mentioned on here yesterday
[13:43:19] <StephenLynx> centerDB?
[13:43:26] <StephenLynx> it started with C, I am sure of that.
[13:43:32] <StephenLynx> advise*
[13:44:04] <brano543> StephenLynx: I looked into that, but i need my own route algorithm, i am doing a specific task :)
[13:44:32] <StephenLynx> yeah, I have zero experience with that. what I know is that mongo does not have any tool to relate objects and collections.
[13:44:54] <StephenLynx> so if your work relies intensively into these relations, you should probably not use mongo.
[13:46:13] <brano543> StephenLynx: I asked about the MongoDB performance. i can make those relations, that's not a problem for me, an adjacency list should do it. So you suggest using mongo as temporary storage, then loading the graph into the server's main memory and running the routing algorithms there?
[13:46:47] <StephenLynx> I suggest using a graph database in the first place.
[13:47:02] <StephenLynx> because I found out yesterday they exist.
[13:47:24] <StephenLynx> mongo performance is good for reading and writing documents, that I know, though.
[13:47:39] <StephenLynx> and it can handle large volumes.
[13:48:09] <StephenLynx> but the thing is, these routing operations would have to be performed by your application
[13:48:55] <StephenLynx> if you use a graph database you will have these operations done in a tool specialized for that and at the same time your application will not have to perform any actions in this regard.
[13:49:07] <brano543> StephenLynx: i asked this school if they could send me a paper on what they were actually doing last year, but they didn't reply http://cs.brown.edu/research/pubs/theses/capstones/2014/chen.andersen.pdf
[13:49:48] <StephenLynx> so even if mongo's raw performance is higher than this database, your operation total performance might be lower because you will be doing two jobs for a single operation if you use mongo for it.
[13:49:58] <brano543> StephenLynx: well, as far as i know, there isn't a graph database where i can specify my own traversal, or at least it's very hard; i'd have to understand Java which i don't know.
[13:50:12] <StephenLynx> java is easy.
[13:50:28] <brano543> i know c++ and python
[13:50:34] <brano543> php
[13:50:42] <StephenLynx> if you know C++ then java is a walk in the park.
[13:50:59] <StephenLynx> I know because I have studied C++ and I'm mainly a java developer.
[13:51:58] <StephenLynx> you shouldn't avoid a tool because of now being familiar with java.
[13:52:06] <StephenLynx> of not*
[13:53:16] <brano543> StephenLynx: I understand, but i am doing it as my bachelor thesis and some bus travel company even said they are interested in my product :)
[13:53:34] <StephenLynx> so?
[13:56:11] <brano543> StephenLynx: In those graph databases there are already implemented functions for shortest path, which i don't need. I can create something as simple as a graph representation myself. The point of a database is that it keeps data persistent and can handle many clients. I also looked at the OpenStreetMap Routing Machine project and i am not sure how their server is implemented, the code is really huge
[13:57:28] <StephenLynx> so the problem is that you need to implement a custom logic for the paths and you believe these graph databases don't allow you to?
[13:58:30] <brano543> StephenLynx: My problem is i am not sure how to deal with creating a multithreaded server that uses my custom program to handle queries.
[13:59:16] <StephenLynx> so the problem is with the server that will be handling the http requests?
[14:00:00] <StephenLynx> if you have to handle many concurrent requests, I suggest io.js.
[14:00:16] <StephenLynx> if your application is CPU intensive, Go.
[14:00:55] <GothAlice> I recommend Python, of course. ;)
[14:00:58] <StephenLynx> io.js deals with concurrency by having a thread per CPU core, so if you perform cpu intensive operations, it will not be really good.
[14:01:06] <GothAlice> We all have our biases. ;)
[14:01:10] <StephenLynx> I have data.
[14:02:00] <StephenLynx> python still is about 20x slower than V8 and uses much more RAM because it uses threads to deal with concurrency.
[14:02:09] <StephenLynx> lunch
[14:03:37] <brano543> StephenLynx: okay guys, listen, let's not waste time. i will try to use mongodb for routing, and if that doesn't work out i will create an in-memory representation of the graph, let's say use STXXL::map for storing it, then find the path using a custom traversal and get additional information about nodes from MongoDB when the route is finished. That is also a way.
[14:04:50] <Cygn> Hey guys, how to handle big insert requests in mongo db?
[14:05:16] <Cygn> Is there a way to import something equivalent like an .sql file in MySql?
[14:05:29] <GothAlice> Cygn: mongoimport
[14:05:30] <Folkol> Yes, you can export and import data-.
[14:06:06] <brano543> StephenLynx: i will let you know in a few days how i proceed, right now i am working on the approximation of points using Boost's envelope minimum bounding rectangle :)
[14:06:06] <GothAlice> See also mongodump/mongorestore. These read and write MongoDB BSON files, mongoimport/export handle other, simpler formats like CSV.
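For reference, roughly how those tools are invoked (database, collection, and file names here are placeholders):

    # full-fidelity BSON dump and restore
    mongodump --db mydb --out /backup/
    mongorestore /backup/
    # simpler formats such as CSV or JSON
    mongoimport --db mydb --collection sales --type csv --headerline --file sales.csv
    # a file of shell statements (db.collection.insert(...)) can also be run directly
    mongo mydb inserts.js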
[14:06:33] <Cygn> I already wrote it in mongo language so mongorestore could be the right way, thanks guys !
[14:07:38] <GothAlice> StephenLynx: Even our data disagree. https://gist.github.com/amcgregor/707936#file-terminal-3-top — in the middle of processing 10,000 concurrent requests memory usage is ~10MB. (This is without threading. Threading adds ~2-4MB.)
[14:08:29] <brano543> StephenLynx: Have a nice day.
[14:10:37] <Cygn> Another question guys, i had to use the Date() function of mongo, that's why i wrote my whole import as a mongodb request (db.collection.insert …). can i also import this statement as a file?
[14:11:32] <GothAlice> Cygn: Yes. I don't recall MongoDB's preferred date format for import, but you could find out by exporting some data involving dates first, then having a gander. :)
[14:12:09] <Cygn> GothAlice: Yeah, my problem is i did that and it exports .bson which is not readable :/
[14:12:36] <GothAlice> That's a dump, not an export. Try the "mongoexport" tool instead, and make sure to select a format like JSON, or CSV (if appropriate).
[14:17:06] <Cygn> GothAlice: It exports like { "$date" : "2014-01-04T00:00:00.000+0000" } … but that is my problem, my import data has a datestring "2014-01-04" which i want to have imported using the Date() function (which i could do in a normal mongo statement).
[14:17:55] <GothAlice> Yeah, that won't work. Data != code. You'll need to reformat those dates in the proper ISO format, wrapped in {$date: } to mark it as such.
[14:18:00] <d0x> Is there a way to see all methods of an array in mongodb?
[14:18:06] <d0x> I mean in the Javascript env.
[14:18:36] <GothAlice> d0x: somearray.<tab><tab>
[14:18:49] <GothAlice> Pressing tab for auto-completion twice will list the possibilities.
[14:18:51] <d0x> ouch
[14:18:53] <d0x> that hurts
[14:18:57] <d0x> tank you :)
[14:19:06] <Cygn> GothAlice: i think i will just insert it as a string, and afterwards update entity of the collection …
[14:19:13] <Cygn> GothAlice: Thanks for your help !
[14:31:16] <pamp> hi guys
[14:32:17] <pamp> i want to create an index, but the key is too large to index
[14:32:51] <pamp> what's the difference in performance between creating a hashed or a text index
[14:32:57] <pamp> ?
[14:38:35] <StephenLynx> Considering how fast a hash function can be, I don't think it would have a severe impact.
[14:43:31] <StephenLynx> GothAlice that benchmark means little. First, I don't know what code is running behind the test. Second, it doesn't compare against anything. Third, I don't know the hardware. Fourth, I don't know how much your server software deviates from the standard python webserver.
[14:48:30] <nir> hi there
[14:49:15] <nir> I have the following collection http://pastecode.org/index.php/view/1ce4d941, and I don't know how to write my query to find all products that have a specific set of "filters"
[14:51:33] <nir> i also try to find products that have a specific filter but that is not working : db.products.find( {"filters": {$elemMatch: {"identifier": "HOM"} }})
[14:52:44] <Cygn> I want to use the value of a field inside a forEach, i know that i can access it f.e. via ….forEach(function(e) { e.value }) - but what to do if my column name is "1"?
[14:53:04] <StephenLynx> nir elemmatch is used to get stuff from subarrays
[14:53:17] <StephenLynx> for your query to work, filters would have to be an array
[14:54:12] <StephenLynx> do your filter's sub-documents have to have that field? it looks redundant having the "identifier" field
[14:54:30] <Cygn> If i try f.e. e.6 i get "unexpected number" if i try e."6" i get "unexpected String" :(
[14:54:32] <StephenLynx> instead of assigning the value of identifier directly to "sex" or "category"
[14:55:32] <StephenLynx> anyway, you can just use dot notation
[14:55:43] <StephenLynx> "filters.sex":yourvalue
[14:55:49] <nir> StephenLynx I can't redesign this collection, it has been created like this by my client :(
[14:55:59] <sweb> how can i group by and count this collection http://paste.ubuntu.com/10449679/ via `tag special 1` ?
[14:56:06] <StephenLynx> filters.sex.identifier
[14:56:09] <StephenLynx> in this case.
[14:56:29] <StephenLynx> sweb aggregation's $group operator
[14:58:09] <pamp> StephenLynx: Hashed indexes don't work with arrays, which is my case.. Does creating a text index have much impact?
[14:58:10] <nir> StephenLynx but I don't know in advance all the keys of the filter collection
[14:58:47] <StephenLynx> but you know the filters you must use in the query?
[14:59:23] <StephenLynx> pamp no idea.
[14:59:26] <nir> no, I only have an array of filters
[14:59:33] <StephenLynx> nir so you know.
[14:59:46] <StephenLynx> how is the array?
[15:00:01] <StephenLynx> [sex:"x",stuff:"y"] ?
[15:00:07] <nir> ["HOM", "T29", "BLE"]
[15:00:22] <nir> don't have the "sex" and "stuff" part :/
[15:00:23] <StephenLynx> wtf
[15:01:00] <sweb> StephenLynx: could you please show me how ... i can't figure out the structure of an aggregation group :(
[15:01:20] <StephenLynx> http://docs.mongodb.org/manual/reference/operator/aggregation/group/
[15:01:51] <StephenLynx> nir how is the search logic supposed to work?
[15:02:16] <StephenLynx> since you don't know which field must match which value?
[15:22:23] <sweb> some one help :( http://paste.ubuntu.com/10450136/
[15:24:17] <StephenLynx> you must first query to filter the results with a $match block and then use an accumulator operator, like $push
[15:24:31] <StephenLynx> grouping in mongo doesn't do stuff automatically.
[15:24:56] <StephenLynx> take a look at their examples.
[15:28:36] <StephenLynx> the group block works similarly to the projection block: you give it a field and something to put in the field. except it can take accumulator operators, it requires an _id on documents, and can't take booleans to just project or not project the field.
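Without seeing sweb's paste, a generic version of that shape (filter first, then group and count; the field names are placeholders to adapt to the real schema):

    db.items.aggregate([
        { $match: { special: 1 } },
        { $group: { _id: "$tag", count: { $sum: 1 } } }
    ])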
[15:35:38] <Cygn> I use a collection called dataChunk, which has a column with the identifier "5", i need it to be 5, i wanted to do it that way: db.dataChunk.update({}, { $rename: { '"5"' : '5' } }), which does not address the column "5" apparently, any suggestions?
[15:42:59] <StephenLynx> hm
[15:43:13] <StephenLynx> afaik, field names are not typed.
[15:43:33] <StephenLynx> wait
[15:43:58] <GothAlice> http://bsonspec.org/spec.html < technically all keys, including array indexes, are string mappings.
[15:43:59] <StephenLynx> you created a field name with quotes?
[15:44:09] <StephenLynx> and you wish to remove the quotes?
[15:44:27] <Cygn> StephenLynx, actually i didn't want to, but the JSON format does not allow a simple number as a key.
[15:44:56] <StephenLynx> let me check that
[15:45:44] <GothAlice> {"27": 42} < not valid
[15:45:45] <StephenLynx> http://pastebin.com/evHtQJg8
[15:45:48] <Cygn> My Problem is i have a whole bunch of date values, that i need to push through the mongo Date() Function in the end. I don't care if i import them using csv, json etc. but after the import i need to address them in any way inside a foreach loop to push them through that function.
[15:46:36] <StephenLynx> do a regular find and show me the results
[15:46:48] <GothAlice> Interesting, JSON parses {"27": 42} fine, but a JS shell doesn't.
[15:47:05] <StephenLynx> { '5': 10 }
[15:47:10] <StephenLynx> node is parsing everything just fine
[15:47:13] <StephenLynx> io*
[15:47:59] <Cygn> http://pastebin.com/Sv8Fpuw0
[15:48:01] <GothAlice> "SyntaxError: Unexpected token ':'." Parse error. running under V8. Hmm.
[15:48:38] <Cygn> GothAlice: That is the point: "24" is interpreted as the string "24" as the column name, but in the end i need 24 (without quotes)
[15:48:42] <StephenLynx> Cygn that is expected.
[15:48:45] <GothAlice> Cygn: Sweet mother of the gods. Importing CSV data, or, what could possibly be the reason to use numeric keys like that?
[15:49:03] <StephenLynx> field names will always be strings.
[15:49:07] <StephenLynx> what you can do
[15:49:15] <StephenLynx> is to get the field name and add a + before it.
[15:49:22] <StephenLynx> so it will be parsed to a number.
[15:49:44] <GothAlice> (Or parseInt(foo, 10) it.)
[15:49:48] <FunnyLookinHat> Is there a way to take my administration user and grant it access to every db in the system ? Or do I have to manually create it for each database?
[15:50:11] <StephenLynx> GothAlice is there any benefit in doing that instead of using the plus sign?
[15:50:55] <GothAlice> FunnyLookinHat: There are some roles that give server-wide permissions. I.e. readWriteAnyDatabase, userAdminAnyDatabase, dbAdminAnyDatabase.
[15:51:08] <FunnyLookinHat> GothAlice, Ah ok
[15:52:45] <GothAlice> StephenLynx: Obviousness, less likely to run into accidental addition (instead of unary plus casting), etc. Basically, using the functional instead of fancy approach is less likely to bite in exotic ways.
[15:53:26] <GothAlice> (Just remember to always supply a radix. 'Cause that's another gotcha.)
[15:53:49] <FunnyLookinHat> GothAlice, when I try to update with a new set of permissions, it seems to automatically apply them to only a single database though? http://hastebin.com/ujaqufipaf.profile
[15:54:29] <GothAlice> FunnyLookinHat: User maintenance happens in the "active" database. Is your "admin" database active, or one of your target databases?
[15:54:50] <FunnyLookinHat> Ah - admin database
[15:55:00] <GothAlice> :)
[15:59:13] <Cygn> StephenLynx, GothAlice: i really try to get on track here, and solve the problem i have in any possible way, but … right now it would most likely fix everything if i could just at least address the value inside my forEach statement, which seems impossible since i can write neither function(e) { e."4" } nor function(e) {e.4} to address it
[15:59:28] <GothAlice> e['4']
[15:59:32] <Cygn> aaaah ! :)
[15:59:35] <GothAlice> Remember: the keys are strings.
[15:59:54] <GothAlice> And in JS, object "attributes" or "properties" can also be accessed using array notation. :)
[16:00:03] <Cygn> GothAlice: Unexpected token [
[16:00:05] <StephenLynx> or e.4
[16:00:19] <Cygn> StephenLynx: Unexpected Number
[16:00:35] <Cygn> http://pastebin.com/frZrX9QP#
[16:00:37] <Cygn> http://pastebin.com/frZrX9QP
[16:00:37] <GothAlice> http://cl.ly/image/1O0O3M0z3j1D
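Putting the pieces together, a minimal shell sketch of the conversion Cygn is after (the collection name and the "6" key are taken from the pastes; it assumes the stored string is in a form the shell's Date() constructor parses, e.g. "2014-01-04"):

    db.dataChunk.find().forEach(function (doc) {
        // bracket notation, because the key is the string "6"
        db.dataChunk.update({ _id: doc._id }, { $set: { "6": new Date(doc["6"]) } });
    });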
[16:00:43] <StephenLynx> use console.log(JSON.stringify(object)) with the object you got back from the query
[16:00:46] <StephenLynx> and show me
[16:01:13] <StephenLynx> e is probably the error.
[16:01:26] <StephenLynx> and the object should be returned on the second parameter.
[16:01:32] <StephenLynx> from what I remember from each
[16:01:40] <Cygn> StephenLynx: Which query are we talking about?
[16:01:44] <Cygn> Since i just get an error
[16:01:46] <StephenLynx> the find.
[16:02:09] <StephenLynx> first of all
[16:02:17] <StephenLynx> find does not return anything
[16:02:26] <StephenLynx> it expects a callback
[16:02:43] <StephenLynx> and passes the found elements as the second parameter of this callback
[16:02:48] <StephenLynx> second
[16:03:01] <StephenLynx> find doesn't return an array, but a cursor that points to objects.
[16:03:20] <StephenLynx> third
[16:03:31] <StephenLynx> formatting your code would be of great help
[16:03:34] <Veejay> Hi, is there a command to do a $pull on all documents of a collection?
[16:03:58] <StephenLynx> clear a collection of all its documents without dropping it?
[16:04:24] <GothAlice> Veejay: db.collection.update({criteria}, {$pull: {…}}, {multi: true})
[16:04:38] <GothAlice> Veejay: See: http://docs.mongodb.org/manual/tutorial/modify-documents/#update-multiple-documents
[16:04:57] <GothAlice> In terms of "all" documents, use an empty search criteria.
[16:05:54] <Veejay> GothAlice: Awesome, thanks
[16:07:21] <Cygn> StephenLynx: The stringify result: http://pastebin.com/bYb309nj - to your other feedback, does that mean i can't use the foreach to the find function?
[16:07:38] <StephenLynx> you can, you just can't use it the way you are trying
[16:08:14] <Cygn> StephenLynx: Was actually based on a very high ranked StackOverflow Example… http://stackoverflow.com/questions/3788256/mongodb-updating-documents-using-data-from-the-same-document/3792958#3792958
[16:08:37] <StephenLynx> that seems to be in the CLI
[16:08:38] <GothAlice> … this data is flat, uses pseudo-numeric field names, and ignores native datatypes for integer and date values. Cygn: Your data fits a relational model far better than MongoDB.
[16:08:46] <StephenLynx> you are using javascript, aren't you?
[16:08:58] <Cygn> StephenLynx: i am using the mongodb console
[16:09:01] <StephenLynx> oh
[16:09:08] <StephenLynx> nevermind what I said then.
[16:09:23] <StephenLynx> it was in the context of javascript.
[16:09:49] <StephenLynx> can't see why you are so concerned then, what you do in the CLI is barely valid in any language.
[16:09:54] <StephenLynx> aside from the logic, of course.
[16:10:43] <Cygn> StephenLynx: Concerned? I just need to transform the date fields, and i can't since i get a syntax error all the time.
[16:10:59] <GothAlice> Hmm, that SO post makes me want to try a $rename that moves a field into a non-extant new sub-document to see what would happen.
[16:11:34] <StephenLynx> Cygn but are you just doing that once on your CLI?
[16:11:47] <Cygn> StephenLynx: exactly just because my import data is malformed
[16:11:47] <StephenLynx> and you said you wanted to rename from a string to a number, but you can't do that.
[16:12:01] <StephenLynx> a field name will always be a string.
[16:12:12] <GothAlice> Ignoring the fact that console doesn't exist in the MongoDB shell, http://cl.ly/image/2c0h1B1w0N0z < totally works
[16:13:22] <GothAlice> Sounds like what you're pining for, though, is mysql_fetch_array. :/
[16:13:23] <StephenLynx> HAHA indeed, if the field name can be parsed to a number, you won't be able to reference it in dot notation
[16:13:53] <GothAlice> StephenLynx: The trick seems to be the first character being numeric. If it's alpha instead, the extra digits don't matter.
[16:14:31] <StephenLynx> omg, true. tried to do foor.5x and it didn't work either :v
[16:14:45] <StephenLynx> js inconsistencies are hilarious.
[16:18:00] <GothAlice> Well, it's just the identifier naming convention. [a-zA-Z_][a-zA-Z0-9_]* — that's a pretty common pattern, same as Python's actually.
[16:18:03] <Cygn> GothAlice: This just worked… i don't really get why but it did...
[16:22:50] <Cygn> Thank you both very much for now… especially for the answers to the confused questions.
[16:23:24] <GothAlice> It never hurts to help. :)
[16:23:48] <GothAlice> But yeah. That data's not a very good fit for MongoDB in its current form. ¬_¬
[16:25:42] <Cygn> GothAlice: Why do you think so?
[16:25:56] <StephenLynx> the field names are awful, imo.
[16:25:58] <GothAlice> … this data is flat, uses pseudo-numeric field names, and ignores native datatypes for integer and date values. Cygn: Your data fits a relational model far better than MongoDB.
[16:26:53] <GothAlice> mysql_fetch_array — get back the record as an array, with actual numeric separation, in PHP-land, at least.
[16:27:40] <GothAlice> "6" is meaningless. "created" has meaning, as an example.
[16:27:50] <Cygn> GothAlice: I need to have the functionality to add columns without having a fixed data structure, the data will in addition never be fetched, it will just be counted using the columns as criteria. I could only use postgres and the json functionality.
[16:27:53] <Angie-98> anyone uses robomongo?
[16:28:36] <StephenLynx> Cygn and why you have to use numbers to that?
[16:28:51] <GothAlice> Cygn: MongoDB already lets you add arbitrary fields, and have those fields differ between documents in the same collection (though do this with an appropriate dose of caution). Nothing you've described suggests to me a numeric field name approach.
[16:29:04] <StephenLynx> if its a list of additional anonymous data, you can just have a subarray and throw them there.
[16:30:45] <Cygn> StephenLynx: actually this is just an identifier used by the infrastructure of the whole company workflow. Each Sale has some attributes, all attributes have an id which identifies them. In the whole system 1 means the market for example.
[16:31:00] <GothAlice> Cygn: Also a candidate for EAV (entity attribute value) storage.
[16:31:23] <StephenLynx> yeah, you can just have a subarray in the document that represents the sale and throw this data there.
[16:31:58] <Cygn> Stephenlynx: you mean each entry should be a subarray? (i really try to get what you want to tell me ;) )
[16:32:19] <StephenLynx> no, each entry is a value in a subarray of the sale.
[16:32:22] <GothAlice> Instead of: {"1":"Belgium", "2":"WBA1N11050J155953", "3":"Test", "4":"", "5":"AppsOffer000", "6":"07-MAY-14", "7":"07-MAY-14", "8":"lifetime", "9":"", "10":"265"}
[16:32:56] <GothAlice> Have: {"data": ["Belgium", "WBA1N11050J155953", "Test", "", "AppsOffer000", "07-MAY-14", "07-MAY-14", "lifetime", "", "265"]}
[16:33:03] <StephenLynx> {id:"WBA1N11050J155953",data:["blah","asdasd"]}
[16:33:34] <StephenLynx> can't you REALLY name the fields, btw?
[16:33:49] <StephenLynx> the whole thing is meaningless and a code hell
[16:34:34] <StephenLynx> if you know which index represents which data, can't you use words instead of the position?
[16:34:53] <Cygn> StephenLynx: (using this value as id would be wrong since this is not the id)… but yes i could just make up names by myself for each column and associate the input data
[16:35:07] <Cygn> StephenLynx: it's not even distinct, that's what i mean.
[16:39:02] <agenteo> hi there, I remember querying a string field for its size but I can’t find how anymore… does anybody know how?
[16:43:19] <Cygn> StephenLynx: I mean i could just use only data: subarray as value, but where would be the improvement?
[16:43:40] <StephenLynx> don't you have anonymous data?
[16:44:51] <agenteo> never mind, db.content_pieces.find( { $where: "this._id.length == 36" } )
[16:46:39] <pamp> hi guys, is it possible to match a document when the difference from the others is that this one has an array field?
[16:47:13] <pamp> like {_id: 123, a: [2,3]} vs {_id: …, a: 34}
[16:47:25] <pamp> i want to match the first example
[16:47:42] <pamp> the documents matched will be in the millions
[16:48:36] <agenteo> I have now a more interesting question, can I check if a field is an ObjectId?
[16:55:54] <Cygn> StephenLynx: i do - if i get right what that actually means
[16:56:17] <StephenLynx> if its anonymous you don't need a field name.
[16:56:35] <StephenLynx> just throw in a subarray and put the other data in named fields.
[16:56:56] <Cygn> i need it, the numbers are not random at all.
[16:57:57] <GothAlice> agenteo: Yes. $type, see: http://docs.mongodb.org/manual/reference/operator/query/type/
[16:58:15] <agenteo> awesome thanks GothAlice
[16:58:23] <GothAlice> agenteo: Be aware that it interacts with arrays in somewhat odd ways. (I.e. if the field being checked for ObjectId-ness is an array containing at least one ObjectId, it'll match, too.)
[16:58:47] <agenteo> I see thanks for the gotcha, I am lucky enough to check for a single field
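For a single field that check looks roughly like this ("someField" is a placeholder; ObjectId is BSON type 7):

    db.content_pieces.find({ someField: { $type: 7 } })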
[17:00:34] <StephenLynx> if they are not random, then its not anonymous
[17:00:46] <StephenLynx> if they are not anonymous, don't use a number, use a word.
[17:06:23] <Cygn> StephenLynx: Okay, you're actually right i will change this afterwards.
[17:07:34] <Waheedi> What is this about? rollback 2 FindCommonPoint
[17:07:43] <Waheedi> after rollback state it becomes fatal
[17:42:22] <Waheedi> How can i recover my primary DB from a fatal states
[17:42:25] <Waheedi> state*
[17:42:30] <Waheedi> now my secondary took over primary
[17:42:36] <Waheedi> but the primary is not able to roll back
[17:49:02] <voidDotClass> Is there a mongo setting for where it looks for its data? i.e the path?
[17:49:35] <GothAlice> voidDotClass: http://docs.mongodb.org/manual/reference/configuration-options/#storage.dbPath
[17:50:53] <voidDotClass> GothAlice: which file contains this/
[17:51:14] <GothAlice> voidDotClass: Whichever configuration file you tell mongod to use on its command-line.
[17:51:24] <GothAlice> Typically /etc/mongod.conf or similar.
[18:09:43] <Siamaster> if I haven't stored a date for my object and want to use the date which is stored in the objectid, how can I do it?
[18:10:03] <Siamaster> lets say I have a collection of users
[18:10:17] <Siamaster> and I want to know which ones are created in the past 5 min
[18:10:49] <GothAlice> Siamaster: db.users.find({_id: {$gt: ObjectId.from_datetime(datetime.utcnow() - timedelta(minutes=5))}})
[18:10:53] <GothAlice> That'll do it in PyMongo.
[18:11:12] <GothAlice> Other drivers should have similar "from_datetime" methods on ObjectId to let you construct time-specific IDs for use in range queries.
[18:12:06] <Siamaster> thanks
[18:13:20] <Siamaster> So I have to do it in application code then
[18:13:36] <GothAlice> Uhm, sorta.
[18:13:45] <GothAlice> All you're doing is providing valid values for comparison.
[18:14:05] <GothAlice> Like saying age: {$gt: 27} — 27 is "application code" in the same way that ObjectId is.
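A shell-side equivalent of that PyMongo helper can be sketched by hand-building a boundary ObjectId from a timestamp (the first four bytes of an ObjectId are seconds since the epoch; the remaining bytes can be zeros for a range query):

    var ts = Math.floor(Date.now() / 1000) - 5 * 60;            // 5 minutes ago, in seconds
    var boundary = ObjectId(ts.toString(16) + "0000000000000000");
    db.users.find({ _id: { $gt: boundary } })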
[18:15:20] <Waheedi> anyone can help me with my fatal replica set node
[18:15:26] <Siamaster> alright
[18:15:29] <Waheedi> its driving me crazy
[18:16:24] <GothAlice> Waheedi: Worst-case scenario in the event that your secondary safely promoted itself to a new primary: nuke the old (dead) primary and re-add it to the replica set as a new node. It'll re-replicate, join the set, then you can force an election if you care which node is primary.
[18:17:21] <GothAlice> (For reliability one should not require a certain node be primary for anything other than logistic reasons, i.e. to keep it in the same datacenter.)
[18:18:02] <Waheedi> GothAlice: the secondary got elected for primary, i don't really care which one is primary. The thing is the fatal state node is almost one month ahead of the current secondary (elected to primary node)
[18:18:10] <GothAlice> …
[18:18:23] <Waheedi> GothAlice: I ran mongo1(old primary) as single database
[18:18:32] <GothAlice> With delayed replication?
[18:18:51] <Waheedi> i have not configured any delayed replication
[18:18:57] <Waheedi> but it seems far more than delayed
[18:19:34] <cheeser> so the secondary that just got elected primary is a month behind your old primary?
[18:19:41] <Waheedi> exactly
[18:19:58] <Waheedi> lool
[18:20:00] <GothAlice> Hmm. Well, on the currently dead former primary you're going to want 2.1x the free space available on-disk than your data currently consumes, at a minimum, then try running mongod in --repair mode.
[18:21:02] <Waheedi> is there any way without using the repair, GothAlice?
[18:21:10] <Waheedi> its going to take couple of days
[18:21:27] <Waheedi> or 36 hours to be more precise
[18:21:38] <GothAlice> Waheedi: Yeah. "Do you have backups" means "we can't fix this". It may be time to start digging out the tape archives. ;) --repair is worth a shot (given the free space available to do it) but no guarantee in the situation you're in, with things in such a wonky state. Make a note, BTW, to add better monitoring of your system to catch things like growing replication lag in the future.
[18:22:04] <Waheedi> 100% regarding delayed replication
[18:22:27] <Waheedi> if i remove the rs.oplog
[18:22:42] <Waheedi> on the dead rs primary would it change anything
[18:23:00] <Waheedi> oplog.rs
[18:25:59] <GothAlice> While technically it shouldn't really hurt to nuke that from disk, you may have hanging associations within the MongoDB configuration or remaining dataset, so it's best to avoid tampering.
[18:26:09] <GothAlice> I.e. make a shapshot backup before --repair'ing, too.
[18:26:27] <GothAlice> Worst-case you _may_ be able to mongodump directly from the on-disk files.
[18:26:42] <GothAlice> --repair effectively dumps then restores.
[18:26:51] <GothAlice> (thus needing the free space.)
[18:27:08] <Waheedi> yeah its a cloud storage mounted to the device
[18:27:15] <GothAlice> T_T
[18:27:16] <Waheedi> almost 86% used
[18:27:21] <Waheedi> its fucked up man
[18:27:24] <GothAlice> EC2/EBS?
[18:27:29] <Waheedi> rackspace
[18:27:41] <Waheedi> block storage/ pretty fast though
[18:27:43] <GothAlice> Hmm. Haven't tried using Rackspace's block volumes yet.
[18:28:57] <Waheedi> What about the oplog.rs? if i deleted it, would it sync again with the replica set if i re-add this dead node?
[18:29:53] <GothAlice> Same comment from before: you may lose some important association which may cause the --repair to bail or other problems. In theory it's disposable.
[18:30:25] <Waheedi> argh
[18:30:35] <GothAlice> You've run into a split-brain problem, though. You have a dead primary newer than the current primary… is new data still being added?
[18:31:24] <Waheedi> i stopped new data from getting in temporarily
[18:31:53] <Waheedi> i configured my apps to talk with the single dead node
[18:32:03] <Waheedi> and they are functioning well
[18:32:43] <Waheedi> no writes though!!
[18:34:57] <Waheedi> i re-added the dead node to the rs
[18:35:05] <Waheedi> now its saying "rollback 2 FindCommonPoint", in errmsg
[18:35:44] <Waheedi> i don't think it should try to find a common point with a month outdated node
[18:35:52] <Waheedi> or that is not whats happening?
[18:49:41] <Waheedi> Ok can I copy the /data/my_db.* files from the dead node to the secondary? the new primary
[18:50:06] <Waheedi> or maybe use rsync?
[19:19:30] <voidDotClass> in order to find stuff within a collection where foo column's value is bar, should i run the query {foo:bar} ?
[19:21:53] <StephenLynx> yes
[19:22:13] <StephenLynx> 'bar', except if bar is already defined with the value.
[19:33:16] <voidDotClass> when i run that, i'm getting "fields" : null , "options" : 0 , "readPreference" : { "mode" : "primary"} , "numSeen" : 0 , "numGetMores" : 0}}
[19:33:22] <voidDotClass> does that mean it didnt find any results?
[19:46:20] <voidDotClass> how do we determine where the mongo.conf file is
[19:50:15] <cheeser> mongo admin
[19:50:17] <cheeser> db.runCommand("getCmdLineOpts")
[19:50:21] <cheeser> like so
[20:09:02] <fewknow> blamo
[20:17:56] <Waheedi> GothAlice: Can i share what i'm planning to do?
[20:23:16] <voidDotClass> Am I doing something wrong here with how i'm using the java driver: https://gist.github.com/anonymous/ec0a3e84cedc68ccfa9d
[20:23:57] <voidDotClass> everything works upto the last one, where i do cursor.size(), it dies with the exception: com.mongodb.MongoTimeoutException: Timed out after 10000 ms while waiting to connect. Client view of cluster state is {type=Unknown, servers=[{address=127.0.0.1:27017, type=Unknown, state=Connecting, exception={com.mongodb.MongoException$Network: Exception opening the socket}, caused by {java.net.ConnectException: Connection refused}}]
[20:24:05] <voidDotClass> which is weird because i didnt' give it 127.0.0.1
[20:24:09] <voidDotClass> i gave it another ip
[20:27:51] <StephenLynx> by default it binds to 127.0.0.1
[20:27:57] <hashpuppy> can i just use mongodump for all my backups? the docs say to use lvcreate
[20:29:05] <voidDotClass> StephenLynx: how do i bind it to a different ip? i'm passing in my actual hostname when i create the client
[20:29:20] <StephenLynx> you must alter its config files.
[20:29:30] <StephenLynx> make sure to add authentication
[20:29:50] <voidDotClass> StephenLynx: which config files?
[20:29:57] <StephenLynx> mongo.conf I guess
[20:29:59] <voidDotClass> the java client or the mongo server?
[20:30:03] <StephenLynx> mongo server.
[20:30:09] <voidDotClass> the server is on a different server than my dev machine
[20:30:52] <StephenLynx> yeah, to connect to it you will have to remove the binding to the local ip and add authentication.
[20:31:14] <StephenLynx> I mean, you could just remove the binding, but then anyone in the same network would be able to access it.
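The relevant mongod.conf settings look roughly like this (YAML config format used since MongoDB 2.6; binding to 0.0.0.0 without auth exposes the server to anything that can reach the port):

    net:
      port: 27017
      bindIp: 0.0.0.0   # or a comma-separated list of specific addresses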
[20:31:15] <voidDotClass> i don't need auth as its on aws and i have the port open to my ip only, so no worries
[20:31:25] <StephenLynx> aws?
[20:31:28] <voidDotClass> amazon
[20:31:29] <StephenLynx> never used it
[20:31:37] <voidDotClass> yeah it lets you limit ports to a certain ip
[20:31:46] <voidDotClass> so what's the reason why its giving me this error
[20:31:47] <StephenLynx> so I dunno how its done there
[20:31:51] <StephenLynx> don't know
[20:31:54] <voidDotClass> its just a regular linux vm
[20:32:06] <StephenLynx> heh
[20:32:29] <voidDotClass> i'm able to connect to it if i shell or use the gui i have
[20:32:36] <voidDotClass> but if i try to connect via the java driver its giving me this error
[20:32:39] <StephenLynx> remotely?
[20:32:45] <StephenLynx> then the problem is not the binding
[20:32:52] <StephenLynx> is in your code.
[20:32:56] <voidDotClass> yes, the mongo is on its own server, and i'm connecting to it remotely
[20:33:06] <voidDotClass> what might be the problem in it then
[20:33:14] <StephenLynx> don't know, never used java with mongo.
[20:33:33] <voidDotClass> i wish i didnt have to either, only reason i'm using it is to move the data off to cassandra :)
[20:33:35] <StephenLynx> by remotely you don't mean you log in with ssh and then use the cli client, right?
[20:33:39] <voidDotClass> no no
[20:33:41] <StephenLynx> ok
[20:33:47] <StephenLynx> then the problem is in the java code.
[20:34:00] <voidDotClass> i wish i knew what it was
[20:44:15] <voidDotClass> i'm following the api exactly as described, but its not working
[20:51:32] <voidDotClass> yeah, was a problem in my code..
[21:13:59] <cheeser> which was?
[21:14:31] <voidDotClass> the endpoint was not being set due to a weird race condition
[21:14:50] <voidDotClass> would've been nice if the client complained on having a null endpoint rather than silently converting it to 127.0.0.1
[21:15:02] <cheeser> not being set?
[21:20:53] <voidDotClass> basically i was doing private static MyClass instance = new MyClass() as the first line
[21:21:04] <voidDotClass> and a couple of lines below i was doing private static String ENDPOINT = ...
[21:21:12] <voidDotClass> so when the constructor was called, endpoint was null
[21:28:50] <cheeser> that's ... weird
[21:29:18] <voidDotClass> try it, its standard behavior
[21:36:50] <cheeser> show me what you've got. i don't have enough of a clear picture to recreate that.
[21:39:35] <voidDotClass> cheeser: https://gist.github.com/aliakhtar/61fa012097bd22cd5784
[21:39:57] <voidDotClass> wait
[21:40:14] <voidDotClass> check now
[21:45:09] <cheeser> yes. because you have an initialization loop there.
[21:46:09] <cheeser> was the fix to move the ENDPOINT declaration up one line?
[21:47:50] <voidDotClass> yeah
[21:51:20] <voidDotClass> just saying, would've been cool if the java driver of mongo didn't act like the php / javascript driver and threw up an exception on receiving null, instead of silently converting it to 127.0.0.1
[21:51:31] <voidDotClass> provide a no arg constructor for that instead of converting null
[21:51:56] <cheeser> file a jira
[21:52:45] <voidDotClass> nah, i'm just using mongo now to move my data to cassandra, other than that i shouldnt have to use mongo again
[23:06:40] <Waheedi> alright :) i fixed my dead node.
[23:06:53] <Waheedi> now things are back to normal replication is working again and syncing..
[23:07:08] <Waheedi> i had 500GB database so --repair option would take days
[23:07:35] <Waheedi> it seems the secondary that was elected to primary was the damaged one and not the primary that was giving the fatal status
[23:40:21] <Waheedi> http://pastie.org/9988529 this is my replicationInfo
[23:40:29] <Waheedi> any idea why oplog is very large?
[23:43:36] <joannac> do you have a really huge disk?
[23:44:10] <joannac> wait, 15gb. ~300gb disk?
[23:44:16] <joannac> defaults to 5% of free disk
[23:44:30] <joannac> you probably don't need that much. resize it if you want
[23:53:11] <Waheedi> thanks joannac
[23:53:47] <Waheedi> Yes the db is almost 310gb its 309.802734375GB
[23:54:18] <Waheedi> Do you think 5GB would be enough
[23:57:27] <Waheedi> Does it have any performance implications? or its just more disk space?
[23:57:42] <joannac> Waheedi: oplog is about recovery
[23:58:00] <joannac> if you have network problems, how long would it take you to fix it?
[23:58:20] <joannac> if your secondary's disk died, how long would it take to get another one
[23:58:45] <Waheedi> few minutes
[23:58:48] <joannac> if it's inside the oplog window, then all you have to do is catch up
[23:58:53] <Waheedi> nowadays
[23:59:10] <joannac> if it's not, then you have to sync from scratch
[23:59:13] <Waheedi> not few maybe 30min
[23:59:32] <joannac> a few minutes? your disk dies at 3am and within 30 mins someone will go and replace a disk?
[23:59:43] <joannac> where do you get people like that and where can I get some? ;)
[23:59:56] <Waheedi> thats the beauty of cloud block storage :)